EDUCATIONAL and PSYCHOLOGICAL 


MEASUREMENT


Volume I 
1941 



Published by 

SCIENCE RESEARCH ASSOCIATES 
1700 PRAIRIE AVENUE · CHICAGO, ILLINOIS






educational and psychological 

MEASUREMENT 

A quarterly journal devoted to the development and application of
measures of individual differences.


EDITOR 

G. Frederic Kuder, Social Security Board


ASSOCIATE EDITORS 


Dorothy C. Adkins. Social Security Board 

Forrest A. Kingsbury. University of Chicago 

M. W. Richardson. United States Civil Service Commission 


BOARD OF COOPERATING EDITORS 


P. J. Rulon

Harvard University 


Richard D. Allen 

Providence Public Schools 

John G. Darley

University of Minnesota 

Harold A. Edgerton 
Ohio State University 

Max D. Engelhart 
Chicago City Junior Colleges 

E. B. Greene

University of Michigan 

J. P. Guilford 

University of Southern California 

E. F. Lindquist

State University of Iowa


David Segel 

U. S. Office of Education

C. L. Shartle
Social Security Board 

H. C. Taylor 

Western Electric Company

Thelma G. Thurstone

Chicago Teachers College 

Herbert A. Toops 
Ohio State University 

E. G. Williamson

University of Minnesota 


Ben D. Wood 

Columbia University 


The journal is open to (1) reports of research on the development and use
of tests and measurements in education, government, and industry,
(2) descriptions of testing programs being used for various purposes,
(3) discussions of problems of measurement in general or in specific
fields, and (4) miscellaneous notes pertinent to the measurement field,
such as suggestions of new types of items or improved methods of treating
test data. Manuscripts should be sent to EDUCATIONAL AND PSYCHOLOGICAL
MEASUREMENT, 1700 Prairie Avenue, Chicago, Illinois.

EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT is published 
quarterly by Science Research Associates, 1700 Prairie Avenue, Chicago, Illinois. 
Subscription rate, $+00 a year. Entered as second class matter June 11, 1941, 
at the Post Office at Chicago, Illinois, under the Act of March 3, 1879.







INDEX FOR VOLUME I


Andrus, Lawrence
    A Composition Test for Foreign Languages . . . 355

Blakey, Robert L.
    A Factor Analysis of a Non-Verbal Reasoning Test . . . 187

Bordin, E. S. and Williamson, E. G.
    The Evaluation of Vocational and Educational Counseling:
    A Critique of the Methodology of Experiments . . . 5

Bordin, E. S. and Williamson, E. G.
    An Analytical Description of Student Counseling . . . 341

Bordin, E. S. and Sarbin, T. R.
    New Criteria for Old . . . 173

Cronbach, Lee J.
    The Reliability of Ratio Scores . . . 269

Darley, John G.
    Counseling on the Basis of Interest Measurement . . . 35

Edgerton, Harold A. and Ellison, Mary Lou
    The Thurstone Primary Mental Abilities Test and College
    Marks . . . 399

Ellison, Mary Lou and Edgerton, Harold A.
    The Thurstone Primary Mental Abilities Test and College
    Marks . . . 399

Engelhart, Max D. and Lewis, Hugh B.
    An Attempt to Measure Scientific Thinking . . . 289

Harrell, Willard and Faubion, Richard
    Primary Mental Abilities and Aviation Maintenance
    Courses . . . 59

Hartson, L. D. and Sprow, A. J.
    The Value of Intelligence Quotients Obtained in Secondary
    School for Predicting College Scholarship . . . 387

Hiskey, Marshall S.
    A New Performance Test for Young Deaf Children . . . 217

Hoyt, C. J.
    Note on a Simplified Method of Computing Test
    Reliability . . . 93

Kopas, Joseph A.
    Guiding Students to Become Self-Guiding . . . 279

Koran, Sidney W.
    Performance Testing in Public Personnel Selection,
    Part I . . . 235

Koran, Sidney W.
    Performance Testing in Public Personnel Selection,
    Part II . . . 365

Kuder, G. Frederic and Shanner, William M.
    A Comparative Study of Freshman Week Tests Given at
    the University of Chicago . . . 85

Lewis, Hugh B. and Engelhart, Max D.
    An Attempt to Measure Scientific Thinking . . . 289

Lorr, Maurice and Meister, Ralph K.
    The Concept of Scatter in the Light of Mental Test
    Theory . . . 303

McCall, William C. and Traxler, Arthur E.
    Some Data on the Kuder Preference Record . . . 253

Meister, Ralph K. and Lorr, Maurice
    The Concept of Scatter in the Light of Mental Test
    Theory . . . 303

Meister, Ralph K. and Reymert, Martin L.
    A Comparison of the Original and Revised Stanford-Binet
    Intelligence Scales

Mosier, Charles I.
    A Short Cut in the Estimation of Split-Halves
    Coefficients . . . 407

Munson, Grace
    The Course in Self-Appraisal and Careers Offered to
    Seniors in Chicago Public Schools

Powell, Norman J.
    Examining Examiners

Reymert, Martin L. and Meister, Ralph K.
    A Comparison of the Original and Revised Stanford-Binet
    Intelligence Scales

Richardson, M. W.
    The Logic of Age Scales

Sandt, Karl E. and Triggs, Frances Oralind
    An Evaluation of Techniques of Measuring Visual Acuity
    at the College Level

Sarbin, T. R. and Bordin, E. S.
    New Criteria for Old . . . 173

Schneidler, Gwendolen G.
    Grade and Age Norms for the Minnesota Vocational Test
    for Clerical Workers

Shanner, William M. and Kuder, G. Frederic
    A Comparative Study of Freshman Week Tests Given at
    the University of Chicago . . . 85

Sprow, A. J. and Hartson, L. D.
    The Value of Intelligence Quotients Obtained in Secondary
    School for Predicting College Scholarship . . . 387

Stuit, Dewey B.
    The Prediction of Scholastic Success in a College of
    Medicine

Thurstone, Thelma G.
    Primary Mental Abilities of Children

Traxler, Arthur E.
    Cumulative Test Records: Their Nature and Uses . . . 323

Traxler, Arthur E. and McCall, William C.
    Some Data on the Kuder Preference Record . . . 253

Triggs, Frances Oralind and Sandt, Karl E.
    An Evaluation of Techniques of Measuring Visual Acuity
    at the College Level

Tyler, Ralph W.
    Contributions of Tests to Research in the Field of
    Student Personnel Work

Williamson, E. G. and Bordin, E. S.
    An Analytical Description of Student Counseling . . . 341

Measurement Abstracts . . . 109

Measurement News . . . 205, 311, 409




















PRESENTING A NEW JOURNAL


The interest and activity in the field of measurement of
human characteristics have never been greater than now. It is
with this thought that the editors present the first issue of
Educational and Psychological Measurement. Educational institutions,
government, and industry are all giving increasing attention
to methods of evaluation aimed at determining the status and
promise of the individual. Improved methods in measurement
are being developed and significant research is being done in
many fields.

In spite of this rising interest, measurement is still a
step-child. The contributions of measurement theory and practice
have found expression in the publications devoted primarily to other
fields. Nowhere has there been a common meeting ground for
the exchange of ideas from area to area except for the more
technically inclined.

Yet there are measurement problems of practical and immediate
concern which are common to many fields. The problem
of estimating future success is common, for example, to the tasks
of helping young people choose appropriate vocations, of selecting
employees, of admitting students to educational institutions
and assigning draftees to jobs in the Army.

The limited interchange of ideas and techniques in measurement
probably can be explained by the fact that there has been
no single journal which could be counted upon to present current
developments and to serve as a forum for the discussion of problems.
It is our purpose to remedy this situation. The pages of
Educational and Psychological Measurement will be open to
contributions from all fields in which techniques of human
measurement are used. Each issue of the journal will have departments
devoted to news and abstracts of recent literature. Future issues
will also carry a section on new tests.

It is hoped that the articles in the journal will not only be
of interest to readers in the specific areas from which the articles
come, but that they will be suggestive of improved procedures
elsewhere.


Washington, D. C. 
December 23, 1940.


G. F. K. 



THE EVALUATION OF VOCATIONAL AND 
EDUCATIONAL COUNSELING: 

A CRITIQUE OF THE METHODOLOGY OF 
EXPERIMENTS* 

E. G. WILLIAMSON AND E. S. BORDIN
University of Minnesota

With increasing attempts to systematize the concepts of counseling,
to describe its techniques, and to delineate its objectives, the
need for evaluative studies has become more insistent. Descriptions
of programs of vocational and educational counseling usually close
with a summary statement that further improvement in this field
is dependent upon evaluative studies (40: chap. XXVII, 42, 43:
chap. IX, 44).† In other words, currently used techniques of
counseling must be subjected to scrutiny and evaluation in order that
more effective ones may be developed. Thus a fertile field for
experimentation may be found in this phase of student personnel work.

Restricting Conditions 

A review of the peculiar conditions of this field of applied
psychology is in order and should precede attempts to experiment. This
paper will attempt to summarize, in a critical and systematic manner,
the assumptions, criteria, methods of measuring outcomes, and
possible experimental designs involved in the evaluation of
educational and vocational counseling. The treatment of personality,
social, family and other types of students' problems will be
considered only in relationship to educational and vocational
adjustment. The evaluation of these other types of counseling, usually
called personality counseling, should be the subject of another
paper.

When we speak of counseling, we refer to individualized efforts
to help students discover vocational assets and disabilities and to
plan an appropriate training program. The making of such an
inventory of potentialities must be preceded by the collection and use
of evidence of abilities, interests and motivations. The techniques
involved in collecting, refining and using evidence have been
described elsewhere (40: chap. III).

*The report of a statistical evaluation of clinical counseling by the
same authors will appear in the next issue of this journal.

†See pages 22-24 for references.




For purposes of evaluation experiments, it is necessary also to
agree as to what counseling is not, to define it negatively. For
example, we cannot accept the assumption that testing alone or that
statistical prediction is counseling. Such would seem to be the
assumption of Thorndike's 1934 study (35) as an evaluation of
counseling techniques. On the other hand, attempts to define
counseling as self-analysis (by students) or as diagnosis based alone
upon impressions, student hopes and interview data (by counselors)
are equally unacceptable. Counseling must be based upon an
understanding of the student; but the counselor does more than make a
diagnosis or prediction. Counseling is the process of helping the
student to plan, and to utilize his assets.

Progress toward adequate evaluation of counseling has been
impeded by two types of attitudes held by some personnel workers.
Some counselors evaluate by means of arm-chair methods. That is,
the effectiveness and general worth of counseling is held to be
self-evident. These persons reason that the general methodology of
guidance must be effective because it appears to be an appropriate
method of dealing with serious and widespread maladjustment
among youth. Other personnel workers appear to believe that
counseling cannot be evaluated. They maintain that the counseling
process is so personal and individual that any attempt by the
counselor to study it will impair his efficiency as a counselor and will
create an artificial situation which will not even remotely resemble
the real counseling relationship.


On the other hand, those who believe that counseling can and
should be evaluated have taken one of three approaches. First,
there is the approach which clings to traditional statistical
methodology in utilizing only those criteria that are objectively
quantifiable. This approach is based upon the premise that such data
as grades, years in college, number of jobs held or wages earned,
when subjected to a straightforward statistical analysis, are
sufficient criteria for evaluation experiments. Second, there is the
approach which utilizes non-statistical case study methods of
evaluation. The third approach attempts to avoid the objections to
the other two methods by using various objective and systematically
derived criteria which are combined by means of impartial judgmental
treatment in contrast with statistical summations.


The assumptions underlying criteria should be made explicit.
Implicit assumptions have been the source of error in evaluations of
guidance. Here again the interpretations of Thorndike's study (35)
serve as an example, although others might also be used. The
conclusion, drawn by many from Thorndike's study, that counseling
was low in effectiveness, would not be objectionable if the
interpreters had indicated that by guidance they meant statistical
prediction of fragmented criteria. In speaking of prediction of
fragmented criteria we refer to the fact that many research workers
lose sight of the possibility that one datum often has different
meaning and significance for different students. If such is the case,
and we have every reason to believe that it is, then any attempt to
use these bits of information either separately or in a rigid
arithmetic combination may obscure the actual outcomes of counseling.

The supposition that specific objectives, such as an increase in
academic achievement, will necessarily be common to all the cases
in an experimental population must also be examined. If we cannot
accept the supposition, then we must consider the possibility that
the use of what is at best a partially applicable criterion is likely to
reveal only slight differences, if any at all. For example, a
low-aptitude student who had been successfully counseled into
withdrawing from college cannot be included in an experiment designed
to reveal the effectiveness of counseling in increasing grades.

There are two other considerations of this type that the careful
research worker must weigh in planning an effective evaluative
experiment. First, he must realize that in order to evaluate a
program of action, it must be carried out. The student must do
something following counseling in order to make evaluation possible.
A physician might just as well attempt to discover the effectiveness
of his medicine when his patient has taken it home and placed
it unused in his medicine cabinet. Secondly, a counselor may
change a student's attitudes, but these must be revealed in
observable or measurable behavior or they cannot be evaluated. Any
outcome that is beyond the scope of some means of dependable
observation is one that cannot be dealt with and therefore must be
rejected by those who require more than blind faith.

The question of the optimum time interval for evaluation is one
that needs further investigation before much progress can be made
in evaluation experiments. It is possible that the optimum time
interval will vary for each individual in any experimental group;
or perhaps the longer the intervening time, the greater the
possibility for the intrusion of other influences that may tend to
minimize the effects of counseling. Some influences may facilitate
adjustment subsequent to counseling; others may cause maladjustment.
Even though counseling results in a distinct separation of
counseled from non-counseled in terms of subsequent adjustment,
the randomization of subsequent influences may cause a regression
toward the mean for both groups.
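
The last point may be made concrete with a minimal sketch. It assumes, purely for illustration, that adjustment is expressed in standard scores and that each follow-up score correlates with the preceding one to the degree `stability`, the remainder being random subsequent influence with the same average for both groups. Under that assumption the expected gap between counseled and non-counseled groups shrinks by the factor `stability` at every period. The function, its parameter names and the figures are hypothetical and are not drawn from any study cited here.

```python
def expected_separation(initial_gap, stability, periods):
    """Expected gap between group means after a number of follow-up periods.

    Assumed model (illustrative only): each period's standard score equals
    `stability` times the previous score plus random influences that have
    the same mean in both groups, so only the carried-over part of the
    original gap survives each period.
    """
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability must lie between 0 and 1")
    if periods < 0:
        raise ValueError("periods must not be negative")
    return initial_gap * stability ** periods


# Hypothetical figures: a gap of one standard deviation immediately after
# counseling, and a period-to-period stability of 0.8.
gaps = [round(expected_separation(1.0, 0.8, k), 3) for k in range(4)]
print(gaps)
```

On these assumptions a longer interval does not erase the effect of counseling, but it leaves less of the original separation to be detected, which is one reason the choice of interval matters.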

The scant knowledge of specific counseling techniques has
forced us to study the effectiveness of the total process. If certain
techniques neutralize the effect of others, then the gross results
would be negligible. As specific techniques are isolated and
described, then new types of evaluation studies may replace the
present gross experiments. Such studies, however, would not appear to
be possible until more adequate descriptions of techniques are made
available by those who actually counsel students.


Formulating Hypotheses 

Counseling can be evaluated only if certain outcomes or criteria
of effectiveness are assumed to result from the counseling process.
These assumptions must be formulated as hypotheses to be "tested"
by experimental and statistical analyses. But a second consideration
is of equal importance. We must determine not only the results
of counseling but, as in all scientific studies, the conditions under
which these outcomes will be produced. We must answer this second
question in terms of what kinds of counseling, what techniques,
what types of counselors and work with what types of students will
produce certain outcomes. Our problem, broadly speaking, then
becomes, "What counseling techniques (and conditions) will produce
what types of results with what types of students?"

Most counselors have empirically derived opinions, hunches and
judgments as to what outcomes or effects they and the students
hope to achieve. But many of these outcomes are intangible
and difficult to formulate as well as difficult to set up in an
experimental design. We may, however, achieve some degree of
agreement, for purposes of experimentation, on the following
assumptions:

Effective counseling will lead to or result in:

1. [illegible in source]

2. . . . range) aptitude than he possesses (actually and potentially).




3. The student will make reasonable progress toward this 
goal (in training school). 

4. The student will be “satisfied” (further motivated) by 
that progress and with his chosen goal. 

In order to achieve these outcomes it is necessary that:

1. The counselor secures the student's cooperation (rapport
in the broad sense) in choosing (orienting himself
toward) a goal and the means to it; the desire to assay his
assets and interests.

2. The student generates enthusiasm to use his assets in
attempting to secure relevant training and to achieve the
chosen goal.

3. The student uses his aptitudes skillfully in securing
training in school.

4. The counselor and the student are able to alleviate,
relieve or remedy pressures and disabilities (family, financial,
emotional, etc.) which interfere with or prevent the eager
and skillful use of aptitudes and the choice of an appropriate
goal.

5. If these pressures or disabilities are too serious for
the counselor to cope with, then use is made of specialized
personnel workers.

6. The appropriate or reasonably approximate type of
training is available to the student.

The above possible outcomes may be the direct or indirect,
immediate or long-term outcomes of counseling. They may reveal
themselves or be observed indirectly and not always by means of
the student's verbal report to the counselor. For example, the
student's orientation may be revealed in his classroom grades. Some
outcomes may be general in nature (results of any type of counseling
technique), and others may be highly specific. Likewise some
techniques may produce one or more of the above outcomes when
used with any type of student having any type of problem. Other
techniques may be highly specific. Much experimentation needs to
be done before we can answer these subsidiary questions. It is most
likely that counseling cannot be equally effective with all types of
students and all types of conditions.

Experimental Designs 

Drawing upon empirical knowledge, we may describe the general
outlines of a number of possible experiments which should
reveal some of the outcomes of counseling. We shall restrict
ourselves to the following possible criteria: academic achievement,
appropriate choices, cooperation, satisfaction, success, quality of
case work, predictive efficiency, composite criteria.

Academic Achievement. The emphasis placed upon grades in
educational circles has necessarily established them as the most
used criterion of the effectiveness of counseling. Most colleges and
universities drop unsatisfactory students on the basis of their
average grades, and reward those who achieve high marks. There are
two methods of experimental design applicable: the comparison of
the student's grade average before and after counseling (2, 4, 8, 19,
24, 43: chap. IX); or a comparison of the average grade of
counseled students with that of non-counseled students who have been
matched for such characteristics as age, sex, level of ability, size
and type of high school and high-school grades (13, 15, 20, 27,
41, 42).
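
The two designs just named reduce to the same arithmetic on paired figures, and a brief sketch may make the parallel plain. It assumes that each counseled student has either his own earlier grade average or the grade average of one matched non-counseled student set against him, and it summarizes the pairs by the mean difference and a paired t ratio. The helper `paired_summary` and all grade averages below are invented for illustration; the paired t ratio is offered as one conventional summary, not as the procedure of any study cited above.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(first, second):
    """Return (mean difference, paired t ratio) for second minus first."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists of at least two pairs")
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / sqrt(len(diffs))
    # With no variation among the differences the ratio is undefined.
    t_ratio = mean_diff / std_error if std_error > 0 else float("nan")
    return mean_diff, t_ratio


# Hypothetical grade averages for four counseled students.
before = [1.0, 2.0, 1.5, 2.5]   # before counseling
after = [1.5, 2.0, 2.5, 3.0]    # after counseling

# Internal control: each student compared with his own earlier average.
gain, t_within = paired_summary(before, after)

# Matched control: each student compared with one matched
# non-counseled student (listed in the same order).
matched_control = [1.0, 2.0, 2.0, 2.5]
advantage, t_matched = paired_summary(matched_control, after)

print(gain, round(t_within, 3))
print(advantage, round(t_matched, 3))
```

The calculation is indifferent to which design supplied the pairs; what the design determines is the interpretation placed upon the difference.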

Both methods of control have definite weaknesses. First of all
it must be emphasized that grades are patently only one of the
possible desirable outcomes of counseling. In addition their reliability
and validity, as a measure of scholastic achievement, have been
seriously questioned. Of more importance are the dissimilarities in
patterns of subjects taken by different students. This condition
makes the criterion of average grade a shifting scale whose
comparability from student to student is questionable. Moreover, in
cases where the student has been successfully advised to leave
college there will be no subsequent grades to evaluate. In the case of
students counseled before matriculating in college, where no
pre-counseling grades are available, the before-and-after method is
not at all applicable.

The method of control by matching is a traditional one in
scientific experimentation. It theoretically provides us with a
comparable population for comparing the effect of counseling with the
effect of "normal" (or random) conditions. At the present time,
however, it is impossible to match individuals on the very factors
that may be of importance, e.g., motivation, personality or
emotional stability. In addition, it is difficult to collect a reasonable
number of cases which will be matchable. While the method of
internal control, i.e., comparing grades before and after counseling,
does away with the matching problem, it leaves indeterminate the . . .

The use of standardized achievement tests is a possible
alternative . . . their objectives and college subjects this factor of
heterogeneity will be a possible disturbance in the use of scholastic
criteria of counseling effectiveness. Experiments should be made to
determine the possible relevancy and validity of this type of criterion.

Educational and Vocational Choices. When evaluating in
terms of educational and vocational choices it may be assumed
that the individual will achieve a more satisfactory life
adjustment if he sets goals for himself that are neither too high nor
too low for his potentialities (18, 32). Thus the task of the
counselor is conceived to be, in part at least, to bring about
congruence between those two factors.

For any case we may compare the student's statement of his
objectives with his potentialities as judged from test data and
relevant tryout experiences. The judgment of the degree to which
better alignment has been achieved as a result of counseling may be
made by the counselor himself, by an outsider who reads the case
notes, by the student, or by all three persons. In favor of the first
procedure, one may contend that there are often subliminal data
not included in the case record which would make the counselor's
judgment most accurate. On the other hand, we may encounter
difficulty in separating judgment from desire since the counselor is
not disinterested in the outcome. An added difficulty with this type
of criterion is the frequency of student cases in which a temporarily
uncertain choice is the most desirable outcome of counseling.

An indirect measure of this criterion may be used if we assume
that more information on educational and vocational topics will
lead to a greater probability of congruence between aspirations and
potentialities. It seems legitimate to expect the clinical counselor
to aid the student in acquiring such information, although this type
of function has usually been involved in group guidance procedures.
For the appraisal of these two types of outcomes, tests and
inventories of the Kefauver-Hand type may be used (1?). By these
means it may be possible to determine whether counseled students
have more information on which to base their educational and
vocational decisions than they had before counseling or than is
possessed by a matched uncounseled control group. Since the mere
possession of occupational and educational information is not a
major objective of counseling, experiments are needed to determine
the relationship between the possession of such information
and the appropriateness of the choices made by students. Such
crucial experiments have not yet been made in support of the
relevancy for counseling of courses in occupational information.




Cooperation with the Counselor. This criterion is based upon
the premise that effective results of counseling cannot be achieved
unless the counselor is en rapport with the student. The fact that
the student receives the advice of the counselor cooperatively is
taken as an indication of rapport and therefore as a criterion of the
effectiveness of counseling. Viteles seems to go even further when
he says: "That advice is followed is probably in itself an evidence
of satisfactory adjustment" (38: p. 75). Such a contention needs to
be evaluated experimentally. We would reject any attempt on a
physician's part to prove the efficacy of a particular medical
treatment by means of evidence that the patient cooperates in
submitting to that treatment. We would certainly withhold judgment
until we ascertained whether his patient eventually had recovered
or died. Cooperation is a desired outcome of counseling but
chiefly as a means or condition necessary to other more basic
outcomes. In this sense it is a preparatory outcome or criterion of
counseling effectiveness.

The measurement of this criterion would be expressed in terms 
of the percentage of the group counseled that had shown various 
degrees of cooperation. Such a result is difficult to interpret since 
there is no standard for determining what either a statistically or a 
socially significant percentage would be. Further experimentation 
and experience would, of course, provide data for deriving such a 
standard. 


The Student's Satisfaction. Satisfaction of the student is
deemed to be a desirable outcome of counseling. This satisfaction
may embrace his educational and vocational objectives, the
counseling assistance, and finally the job that he ultimately secures. The
student's satisfaction with any of the three may be inferred from
his verbal report, either on an interview basis or by means of an
attitude test. Obviously many subtle or delayed satisfactions may
not be readily observed or felt by the student. Dissatisfaction which
results from frustration may be, and oftentimes is, followed by later
reconciliation to substitute adjustments.

Concerning satisfaction with educational and vocational
objectives as criteria, two methods of control may be used. The
satisfaction of the student may be measured before and after counseling
or the satisfaction of a counseled group may be compared to that of
a non-counseled group. In the case of satisfaction with counseling
assistance (25, 39) neither of these methods is possible. To measure
a student's satisfaction with counseling assistance before he has
been counseled or when he has not been counseled would be
meaningless. We can only determine the percentage of students who
expressed degrees of satisfaction with the counseling assistance
received and compare the results for two or more counseled groups.
In a sense this criterion is usable to determine which of two or
more counseling methods, or counselors, is more effective.
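
The comparison described here amounts to tabulating, for each counseled group, the percentage of students at each degree of satisfaction and setting the tables side by side. The following is a minimal sketch under assumed conditions: three degrees of satisfaction, two groups of ten students each, and responses invented for the purpose. The function `percentage_by_degree` and the category labels are hypothetical.

```python
from collections import Counter

DEGREES = ("satisfied", "uncertain", "dissatisfied")


def percentage_by_degree(responses, degrees=DEGREES):
    """Percentage of a group's responses falling at each degree."""
    if not responses:
        raise ValueError("the group has no responses")
    counts = Counter(responses)
    return {d: 100.0 * counts[d] / len(responses) for d in degrees}


# Hypothetical reports from students counseled by two different methods.
group_a = ["satisfied"] * 6 + ["uncertain"] * 3 + ["dissatisfied"] * 1
group_b = ["satisfied"] * 4 + ["uncertain"] * 4 + ["dissatisfied"] * 2

pct_a = percentage_by_degree(group_a)
pct_b = percentage_by_degree(group_b)

# Difference in percentage points at each degree (group A minus group B).
difference = {d: pct_a[d] - pct_b[d] for d in DEGREES}
print(pct_a, pct_b, difference)
```

Such a table shows only which group reports more satisfaction; it supplies no standard for judging whether either percentage is high enough in itself.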

The systematic and quantitative data provided by the attitude
scale technique have not as yet been exploited in the evaluation of
counseling. There are three types of attitude scales that may be
used. First, a scale measuring the student's attitude toward the
school and his educational training. Bell has already described
such a scale for high school students (3: p. 117-23). Second, a scale
measuring the student's attitude toward his vocational objectives.
Remmers has constructed such a scale and has used it in the
appraisal of the effectiveness of group guidance (28, 29). Third, a scale
measuring the student's attitude toward the counselor and the
counseling assistance. This type of scale has had practically no
application. In fact we have found only two instances of its use
reported in the literature (14, 23). The usual approach has been
through the report of the individual to direct questioning.

While the student's report is the easiest way to determine
satisfaction and cannot be ignored as one type of satisfaction response,
it has many weaknesses. For example, it may conceal real
dissatisfaction behind a rationalization process. It may be a reflection of
dissatisfaction in some other area than education or vocation, e.g.,
social, recreational, sex. The desire to please the counselor because
of fixation or gratefulness may lead to a report of satisfaction. In
some cases it seems too much to expect a feeling of complete
satisfaction even with the most successful counseling, since a counselor
cannot be expected to overcome the false hopes of a lifetime in a
relatively short period of time. If the individual's stratum of society
requires a level of aspiration far beyond his capabilities, the
counselor cannot be expected to bring about complete and immediate
satisfaction.

Satisfaction with a job has been the most frequently used
criterion of the effectiveness of vocational counseling (4, 5, 6, 16, 22,
23, 25, 30). In addition to the direct report of the student, scores
on the Hoppock Job Satisfaction Blank and the number of voluntary
shifts in jobs have been used as measures of job satisfaction. All of
these criteria lend themselves to the use of both an internal and a
matched control. But many objections are encountered to
satisfaction with the job, measured in any manner. Job dissatisfaction may
reflect dissatisfaction with the low starting salaries which are
characteristic of most jobs rather than with the occupational choice
resulting from counseling. The dissatisfaction may also be caused
by local conditions on the job, e.g., an unpleasant supervisor or
uncompanionable workmates, instead of maladjustment to the work
involved. Likewise there are special objections to the use of shifts
in employment as a criterion, since it is difficult to distinguish a
voluntary from a forced shift. A shift, as measured, is an
all-or-none process and does not allow for a measure of degrees of
satisfaction or promotion.

The method of internal control with job satisfaction as the cri¬ 
terion is different from the one previously outlined. In this case 
before-after comparisons are not applicable. Instead, those who are 
in an advised occupation are compared to those who are not or 
with those who are in an occupation not discussed with the counselor. 
For this control to be meaningful the categories of occupations 
must be broadly interpreted according to their general 
functions. 

Success on the Job. This criterion assumes that effective 
counseling should lead students to seek and secure jobs in which 
they can be successful. It can be measured by employer's reports, 
number of advancements, number of forced shifts and wages earned. 
The controls applicable are the same as those for the criterion of 
job satisfaction (4, 9, 16, 22, 25, 30). 

The use of a success criterion has at least four general weak¬ 
nesses. First of all, success is a relative matter, relative to the stu¬ 
dent’s ambitions and to the reactions of his social group to his 
achievements. Secondly, success may come years later with many 
other factors, unrelated to the original counseling, intervening to 
cause it. Success in school is a more immediate adjustment the 
student must make before the vocational adjustment is necessary. 
Thirdly, some students advance vocationally more quickly because 
of aids from parents or friends and not because of counseling. 
Finally, this criterion is complicated by the influence of the quality 
of placement work in the senior year of training and is only re¬ 
motely a criterion of counseling in the freshman year. 

Each of the methods of estimating this criterion has been seri¬ 
ously criticized (33, 34). Employer’s reports may be subject to 
error because of the influence of an adverse personal relationship 
between employer and employee unrelated to quality of work, be¬ 
cause of the state of the labor market or because of atypical suc¬ 
cesses or failures at the time of the follow-up interview or ques¬ 
tionnaire. Quite often there will be problems of locating the 
employer, especially when the student has experienced a number of 
shifts within a short period of time (9). The absence of standards 
for comparison and the difficulty in securing cooperation are also 
contributory factors to the unreliability of employers' reports. 

Number of advancements in employment may be unsatisfac¬ 
tory as a criterion of success because the best occupation for an 
individual may be one in which there are few opportunities for 
advancement. In addition, in most cases advancement occurs over 
a long period of time. The longer the intervening time, the more 
difficult it is to determine whether the original counseling has 
been the decisive factor rather than any of the many intervening 
influences. Likewise, the number of shifts in employment pre¬ 
sents drawbacks because of the difficulty in distinguishing volun¬ 
tary from forced shifts and the all-or-none nature of shifts in 
jobs. 

Paterson and Darley (26; p. 19) and, more recently, Lurie (21) 
have presented evidence which indicates that shifts in jobs may 
not always be reliable indices of the individual’s adjustment. 
The older study found that the number of job changes did not 
discriminate workers unemployed early in the depression from 
those unemployed late. Lurie found that workers discharged 
during retrenchment were, as a group, as capable as those 
retained. 

In order for comparisons on the basis of wages earned to be 
meaningful, it is necessary to compare individuals who are work- 
ing on jobs where comparable wage scales prevail, a difficult task. 
Another objection is that wages may reflect extra-individual 
conditions beyond the scope of the counselor’s function. 

Quality of Case Work. The type and appropriateness of the 
various procedures and techniques used by the counselor are 
assumed to be the marks of good counseling. Studies using such 
criteria are, however, to be considered as preparatory to final 
studies of the effectiveness of guidance. It should be recognized, 
however, that unless thorough-going methods are used there is 
little point in making an experimental evaluation (42). 

A critical analysis of the techniques used by the counselor 
and a critical reading of case history and interview notes are the 
most feasible methods to determine their appropriateness (7, 39). 
An unbiased but well-informed “outside” judge would seem to be 
the most desirable agent to perform such an analysis of case 
records. There is a difficulty here in that a well-informed judge 
is likely to be one who has had counseling experience himself 
and is, therefore, unlikely to be free of convictions. This method 
is at best a rough measure of whether the counselor used the 
particular techniques judged appropriate by other counselors. No 
measure of the effectiveness of these techniques results from the 
use of this criterion. 

Predictive Efficiency. The efficiency of educational diagnosis 
by the counselor may perhaps be studied more accurately. One 
possible experimental setup would compare the efficiency of pre¬ 
diction for pre-college cases by the counselor to that of a statis¬ 
tical predictive equation (5). The problem could be further 
differentiated by comparing predictions made by the counselor on 
the basis of preliminary information, tests, questionnaire infor¬ 
mation and preliminary interview, with predictions after the first 
counseling interview. This would serve to determine the relative 
importance or validity of the information and impressions col¬ 
lected in the counseling interview with case data, such as test 
scores, available to the counselor before he confers with the 
student. Another differentiating study would involve having a 
case reader, who had no counseling relationship with the student, 
predict educational achievement on the basis of all the informa¬ 
tion available up to, but not including, the counseling interview 
itself. Such predictions may be compared with those made by the 
counselor after he interviews the student. Such crucial experiments 
are needed; a preliminary one will soon be reported by 
the authors. 

There are two assumptions that may be applied here as a 
basis for evaluating the counseling program. One objective that 
may be assumed for a counseling program is that of enabling 
students to compensate successfully for their disabilities in order 
to succeed. If that is an objective, then the expected evidence 
of efficiency in the counseling program would be a lower prog¬ 
nostic efficiency of a test battery for counseled students than for 
non-counseled students. If counseling is effective in this sense, 
then students who, if left alone, would fail, may succeed. 

Another objective of counseling may be to bring all factors 
other than those of aptitude (interest, opportunity, working conditions 
and so on) to a common level. Thus the performance of 
the students would be distributed according to their levels of 
ability. The greater the excess of predictive accuracy for coun¬ 
seled over non-counseled, the closer the counseling program will 
be presumed to have come to the ideal—that of removing all 
influences other than ability which interfere with student 
achievement. 
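
The two assumptions above lead to opposite expectations about the correlation between a test battery and later achievement. A minimal sketch of the comparison follows, assuming prognostic efficiency is summarized by a Pearson correlation; the scores and grade averages are hypothetical illustrations, not data from any study cited here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def prognostic_efficiency(pairs):
    """Correlation between test-battery score and later grade average."""
    scores, grades = zip(*pairs)
    return pearson_r(scores, grades)

# Hypothetical (test score, grade average) pairs, invented for illustration.
counseled = [(40, 1.9), (50, 2.1), (60, 2.0), (70, 2.3), (80, 2.2)]
non_counseled = [(40, 1.0), (50, 1.6), (60, 2.0), (70, 2.6), (80, 3.0)]

r_counseled = prognostic_efficiency(counseled)
r_non_counseled = prognostic_efficiency(non_counseled)

# Negative difference: the pattern expected if counseling helps students
# compensate for disabilities. Positive difference: the pattern expected if
# counseling equalizes factors other than ability.
difference = r_counseled - r_non_counseled
print(round(r_counseled, 3), round(r_non_counseled, 3), round(difference, 3))
```

With these invented figures the counseled group shows the lower correlation, which is the pattern the first assumption would predict; the opposite sign would fit the second assumption.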

Composite Criteria. All of the criteria discussed above have 
been partial criteria, since none of them was assumed to be 
evaluating all the possible objectives of counseling. We turn 
now to possible methods by means of which a more comprehensive 
evaluation of counseling may be secured. 

The Use of a Judgment Criterion 

It is at this point that a clear schism appears between an 
approach which is narrowly statistical and an approach which 
makes use of statistical methods in conjunction with the experi¬ 
mental situation. The former point of view has the desirable 
objective of clear-cut results, but, in its blind adherence to traditional 
method, produces results which are unlikely to be significant 
either statistically or socially. This method would mechanically pool 
all of the part-criteria either in some form of average or in a profile. 
The method of averages compounds the artificiality which previously 
had been indicated as inherent in the use of the part-criteria 
without reference to the individuality of each student. 
The method of profiles suffers from a lack of well developed 
statistical techniques for handling that type of data and, more 
seriously, from the fact that artificial data cannot be refined and 
validated by casting them into profile form. 

Rather than sacrifice meaningfulness for neatness of statis¬ 
tical treatment, the other approach has clearly recognized the 
impracticability, at the present, of getting more than rough 
measures of the general efficiency of counseling (33, 37). It has 
therefore attempted to use a judgment criterion by means of 
which the adjustment of the student is estimated in terms of his 
original problems and any of the available data, including the part 
criteria (16, 22, 27, 30, 31, 36, 38, 42, 43: chap. IX). 

As described in Williamson and Darley the judgment of adjustment 
is based upon a follow-up interview of the student (43: 
chap. IX). The status of the case at the time of follow-up is 
always considered in the light of the diagnosis and prognosis 
made earlier by the counselor. All of the various types of data 
—grade achievement, the student’s statement of satisfaction and 
adjustment with regard to vocational orientation and choice, 
information concerning the student’s activities, judgment of gen¬ 
eral attitude, etc.—are weighed by the judge according to their 
relevance to the individual case. In this way the possible errors 
inherent in a non-personalistic interpretation of objective data 
are minimized. 

The simplest experimental design requires that either the 
counselor or a case reader make the judgment as to the degree of 
the student’s adjustment subsequent to counseling and in contrast 
with his pre-counseling adjustment. A detailed manual of direc¬ 
tions, including examples of degrees of adjustment, is necessary 
in such an experiment (43: chap. IX). The results are reported 
in terms of the percentage of students counseled who achieved 
various degrees of adjustment. If the counselor makes such a 
judgment, it should be pointed out that, if he is a good one, he 
will know subtle angles and attitudes which are unlikely to be 
explicitly stated in the case records and which would be over¬ 
looked by an independent case reader. Many of the subtle influ¬ 
ences in a case may even exist in unverbalized form for the coun¬ 
selor and have no possibility of appearing in the case record. 
In addition, the counselor knows, perhaps better than anyone 
else possibly can, what he has been trying to do. The disad¬ 
vantages of using the counselor’s judgment lie first of all in his 
special interest in the results which may lead to an approach 
which is either too self-critical or too self-lenient; and secondly, 
in the undesirable consequence that the counselor's effectiveness 
in counseling may be decreased because of his awareness of his 
responsibility for evaluating his own efforts. 

While the use of the independent case reader obviates the 
possibility of impairing the counselor’s effectiveness and intro¬ 
duces a theoretically impartial evaluator, it also has its draw¬ 
backs. As has been indicated, the case reader may miss many of 
the nuances. There are also so many conflicting philosophies and 
procedures and techniques in counseling that the case reader may 
be either unsympathetic with or ignorant of the counselor's specific 
objectives. In order to achieve greater impartiality and objectivity, 
two case readers and an arbitrator have been used. Williamson 
has reported the use of this method with three trained 
workers who had nothing to do with the diagnosis and counseling, 
but who collected data directly from the students for independent 
and pooled judgments of effectiveness (42). With 
trained judges heterogeneity of point of view need not inter¬ 
fere with consistency in judgments. 
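
The pooling of independent judgments can be illustrated with a small sketch. It assumes a four-step rating of degree of adjustment and a rule that the arbitrator's rating is used only where the two readers disagree; both the scale and the rule are assumptions made for illustration and are not taken from the studies cited.

```python
def pooled_judgments(reader_a, reader_b, arbitrator):
    """Keep a rating when both readers agree; otherwise use the arbitrator's.

    `arbitrator` maps the index of each disputed case to the arbitrator's rating.
    """
    pooled = []
    for case, (a, b) in enumerate(zip(reader_a, reader_b)):
        pooled.append(a if a == b else arbitrator[case])
    return pooled

def agreement_rate(reader_a, reader_b):
    """Proportion of cases on which the two readers gave the same rating."""
    agreed = sum(1 for a, b in zip(reader_a, reader_b) if a == b)
    return agreed / len(reader_a)

def percentage_by_degree(ratings):
    """Percentage of cases falling at each degree of adjustment."""
    counts = {}
    for rating in ratings:
        counts[rating] = counts.get(rating, 0) + 1
    return {degree: 100.0 * n / len(ratings) for degree, n in sorted(counts.items())}

# Hypothetical ratings (1 = least adjusted, 4 = best adjusted) for six cases.
reader_a = [3, 2, 4, 1, 3, 2]
reader_b = [3, 3, 4, 1, 2, 2]
arbitrator = {1: 3, 4: 3}  # ratings only for the two disputed cases

pooled = pooled_judgments(reader_a, reader_b, arbitrator)
agreement = agreement_rate(reader_a, reader_b)
percentages = percentage_by_degree(pooled)
print(pooled, round(agreement, 2), percentages)
```

The agreement rate gives a rough check on consistency between readers, and the final dictionary is in the form in which such results are reported, namely the percentage of counseled students at each degree of adjustment.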



While the experimental method outlined may yield evidence 
of the degree of improvement in the counseled population, it 
does not prove that cooperation with the counselor’s suggestions 
was a necessary condition. Evidence for the latter may be ob¬ 
tained by estimating the degree of cooperation of each student by 
one of the methods previously discussed. By comparing the ad¬ 
justment achieved by students who cooperated, with the adjust- 
ment of those who did not, we can determine whether cooperation 
was necessary and to what degree. If we find that those who 
cooperate adjust better than those who do not, we still have a 
question of whether the degree of adjustment achieved by those 
who did not cooperate might not be equalled or bettered by those 
who received no counseling at all. A matched non-counseled 
group would seem to be the only means of providing an answer 
to this question. We have already discussed the possibilities of 
the matching process. If we were to attempt to avoid the 
matching problem by counseling every other student who comes 
for counseling, retaining the other half of the group as controls, 
we would be doing violence to a social canon. The real solution 
must await a time when we have sufficiently isolated treatment 
techniques and problems to compare two treatments used with 
the same type of counseling problem. 

General Considerations 

Our consideration of the types of criteria and the methods of 
measuring them, feasible in the evaluation of the effectiveness 
of counseling, has touched upon definite limitations on exact 
evaluation of counseling. Whether these weaknesses will be 
insurmountable and will restrict evaluation to rough, rule-of- 
thumb methods depends upon future progress in experimentation. 

One type of difficulty is the inability to set up clear delinea¬ 
tions of the problems and variables involved. This has been 
traced first to the inadequacy of descriptions of diagnostic and 
treatment techniques of the counselor plus the gaps in knowledge 
of student problems (40: chap. XXVII). A second source is 
the element of uniqueness in the student’s problems and the counseling 
techniques appropriate for them. The criteria which have 
been considered have the weakness of being either too gross a 
measurement or so far removed from the individual as to lack the 
quality of meaningfulness. If, in the future, methods are devised 
for providing more adequate criteria, then more exact experimentation 
may be made. 




A second limitation, inherent in the nature of the counseling 
situation, is the sampling difficulties involved in setting up controls. 
This difficulty has been stated by Murphy et al., “In connection 
with the control group, there should be noted the relative 
impossibility of securing a true ‘control.’ The fact that the experimental 
group applied voluntarily for counseling introduces 
a selective factor unmatchable among non-clients” (23: p. 952). 
Although this assumption has never been established even in the 
plethora of learning experiments, only assumed, caution should 
be applied in interpreting the results of matching experiments. 
We also lack adequate techniques for matching students for such 
pertinent factors as interest, perseverance and other similar qualities. 
At the same time, it is impossible to set up an experiment 
which would entail selecting cases from the general population, 
since willingness to be counseled would seem to be one of the 
necessary conditions for counseling. These limitations in sam¬ 
pling methods imply that evaluation must necessarily be a long¬ 
time process, involving a great deal of experimentation with 
different methods. 


The condition that diagnosis and counseling cannot be 
studied separately is a further complicating factor. When the 
counselor has made a diagnosis of the student’s problem, its 
causes, and the types of treatments that are likely to solve it, 
he cannot determine whether his diagnosis was correct unless the 
student carries out the recommendations. For example, if a 
counselor's diagnosis states that student A can do effective work 
in college only by following certain of his recommendations, 
student A must remain in college for that diagnosis to be tested. 
The inability to control the conditions necessary for an adequate 
tryout of counseling recommendations often precludes determina¬ 
tion of the effectiveness of the advice. Factors which are often 
beyond the control of either the counselor or the student include 
restriction imposed by the school administration, those imposed 
by social codes, prejudices and attitudes of students or parents 
and lack of proper placement facilities. 

The methods discussed in this paper have little application to the 
comparison of individualized counseling with other types, e.g., group, 
traditional, casual interview, etc. This [...] used by the counselor. 
Grade or information achievement represents 
the only type of criterion that can be applied in evaluating 
this type of counseling. 

Our discussion has shown that there is a need for more sys¬ 
tematic studies, using the more feasible part-criteria. Other 
approaches have been indicated as having possibilities. The rela¬ 
tionship between the student’s educational and vocational objec¬ 
tives and his level of ability should be studied further as a 
criterion of adjustment. Some studies have already yielded some 
preliminary results (1, 10, 29, 32). In the last few years this 
problem has been receiving attention under the term level of 
aspiration. Studies have revealed some provocative principles 
under laboratory conditions (11, 12), but we must learn whether 
these principles have validity for life situations. We should 
determine whether success in one area, i.e., vocational or educational, 
has an effect on the level of aspiration in other areas of 
adjustment. What are the relations between level of aspiration 
and feelings of failure? Can vocational or educational success 
be such a potent factor that it would outweigh other experiences 
in determining an individual’s general success-failure feelings? 
How vital are social group factors in determining the individ- 
ual’s level of aspiration? To what degree do levels of aspiration 
persevere at various age levels? The answers to these questions 
would seem to be pregnant with implications for both the coun¬ 
selor and the evaluator. 

If and when our knowledge of student problems and of diag¬ 
nostic and treatment techniques has advanced sufficiently, we will 
have the opportunity to carry out more exact investigations. At 
this point, we can foresee experimental designs which should 
be applicable when such advances are made. One possibility is 
an experiment in which individuals having problem A will be 
divided into two groups, one which will receive treatment 1, the 
other treatment 2. In this way we may determine which specific 
techniques are most effective for a particular problem. 

Another plan of experiment could be designed to determine 
for what types of problems a treatment is applicable. Here, two 
groups, one representing problem A, the other problem B, would 
both receive treatment 1. Both methods could be expanded to 
include all types of treatments and problems. Criticism of these 
designs may be directed at the apparent assumption that problems 
may appear isolatedly. That this is extremely unlikely cannot be 
denied. Yet, assuming advances in our techniques and procedures, 
it seems possible that the types of factorial design used in 
analysis of variances can be utilized to take care of the effects 
of interactions among treatments for various problems. The 
success of such an experiment will depend upon the discovery of 
a number of cases in which one problem is clearly present and 
other types are minimal in significance or complexity. 
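
The factorial arrangement suggested above can be sketched as a balanced two-way analysis of variance, with problems as one factor and treatments as the other. The sketch assumes equal numbers of cases per cell and a numerical adjustment score; the scores are hypothetical and chosen so that each treatment suits a different problem.

```python
def two_way_anova(cells):
    """Balanced two-way analysis of variance with replication.

    cells[i][j] is the list of scores for problem i under treatment j;
    every cell is assumed to hold the same number of cases.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(x for row in cells for cell in row for x in cell) / (a * b * n)
    row_means = [sum(x for cell in row for x in cell) / (b * n) for row in cells]
    col_means = [sum(x for row in cells for x in row[j]) / (a * n) for j in range(b)]
    cell_means = [[sum(cell) / n for cell in row] for row in cells]

    ss_problem = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_treatment = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_interaction = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_within = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_within = ss_within / (a * b * (n - 1))
    f_interaction = (ss_interaction / ((a - 1) * (b - 1))) / ms_within
    return {
        "ss_problem": ss_problem,
        "ss_treatment": ss_treatment,
        "ss_interaction": ss_interaction,
        "ss_within": ss_within,
        "f_interaction": f_interaction,
    }

# Hypothetical adjustment scores: rows are problems A and B,
# columns are treatments 1 and 2, three cases per cell.
scores = [
    [[4, 5, 6], [2, 3, 1]],  # problem A: treatment 1, treatment 2
    [[2, 1, 3], [5, 4, 6]],  # problem B: treatment 1, treatment 2
]
result = two_way_anova(scores)
print(result)
```

In this invented case neither problems nor treatments differ on the average, yet the interaction term is large; this is the situation in which a treatment is effective for one problem and not for another, and it would be missed by comparing treatments without regard to the problem.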

Summary and Conclusions 

1. All available methods of evaluation have weaknesses. 

2. Composite criteria which avoid arithmetic combination of 
the part-criteria are at present least open to question, although 
still being crude measures. 

3. The problem of securing sufficient data without doing 
violence to the concept and practice of counseling is a real one. 
Involved also are the inadequacy and incompleteness of most 
available case records. 

4. The proper time interval to use for evaluation is extremely 
important because of the possible relationship between the inter¬ 
vention of confusing factors and the length of time between 
counseling and evaluation. 

5. The methods used for validation of diagnostic and prog¬ 
nostic tools (e.g., tests) may not be applicable because of the 
uniqueness of each counseling situation. Stated another way, the 
methods of studying students in general may not be applied to 
the study of individual students with particular problems. 

6. An impediment to more exact evaluation is the inability to 
control conditions for an adequate test of counseling recommen¬ 
dations. 


BIBLIOGRAPHY 


1. Alberty, H. B. “The Permanence of Vocational Choices of High 
School Pupils.” Industrial Arts Magazine, XXIV (1925), 203-07. 

2. Beaumont, H. “The Evaluation of Academic Counseling.” Journal of 
Higher Education, X (1939), 79-82. 

3. Bell, Hugh M. The Theory and Practice of Personal Counseling. 
Palo Alto: Stanford University Press, 1939. 

4. Burt, Cyril & Others. A Study in Vocational Guidance. Industrial 
Fatigue Research Board, Report No. 33. London: H.M. Stationery 
Office, 1926. 

5 ' RS5,5v(S?M°!o7 S “ d “‘ °< *•"'»«/ 

* Club G " ida “' Proec “” 0 *— 



7. Coler, C. S., Fitch, John A., Fitch, Florence Lee, Paterson, Donald G. 
General Appraisals of the Adjustment Service. New York: American 
Association for Adult Education, 1935. 87 pages. 

8. Cowley, W. H. “An Experiment in Freshman Counseling.” Journal 
of Higher Education, IV (1933), 245-48. 

9. Earle, F. M. Methods of Choosing a Career. London: George G. 
Harrap & Company, Ltd., 1931. 

10. Feingold, G. A. “The Relation Between Intelligence and Vocational 
Choices of High School Pupils.” Journal of Applied Psychology, 
VII (1923), 152. 

11. Frank, Jerome D. “Individual Differences in Certain Aspects of Level 
of Aspiration.” American Journal of Psychology, XLVII (1935), 
119-28. 

12. Frank, Jerome D. “Some Psychological Determinants of the Level of 
Aspiration.” American Journal of Psychology, XLVII (1935), 285-93. 

13. Freeman, H. J. and Jones, L. “Final Report of the Long-Time Effect 
of Counseling Low-Percentile Freshmen.” School and Society, 
XXXVIII (1933), 382-84. 

14. Hawkins, L. S. and Fialkin, Harry N. Clients' Opinions of the Adjustment 
Service. New York: American Association for Adult Education, 
1935. 95 pages. 

15. Holaday, P. W. “The Long-Time Effect of Freshmen Counseling.” 
School and Society, XXIX (1929), 234-36. 

16. Jennings, J. R. and Stott, M. B. “A Fourth Follow-up of Vocationally 
Advised Cases.” Human Factor (London), X (1936), 165-74. 

17. Kefauver, N. and Hand, H. C. Manual for Kefauver-Hand Guidance 
Tests and Inventories. New York: World Book Company, 1937. 

18. Kirkpatrick, F. H. “Vocational Guidance in An American College.” 
Human Factor (London), XI (1937), 409-14. 

19. Leman, A. C. “An Experimental Study of Guidance and Placement of 
Freshmen in the Lowest Decile of the Iowa Qualifying Examination, 
1925.” University of Iowa Studies in Education, III (1927), 8. University 
of Iowa. 

20. Lund, S. E. Torsten. “The Personal Interview in High School Guidance.” 
School Review, XXXIX (1931), 196-207. 

21. Lurie, W. A. “Intra-Individual and Extra-Individual Factors Influencing 
the Levels of Vocational Aspiration and Achievement.” A 
paper read at the Forty-sixth Annual Meeting of the American Psychological 
Association, Columbus, Ohio, 1938. Abstract in Psychological 
Bulletin, XXXV (1938), 670. 

22. MacRae, A. “A Follow-up of Vocationally Advised Cases.” Journal 
of the National Institute of Industrial Psychology, V (1931), 242-47. 

23. Murphy, J. F., Hall, O. M., and Bergen, G. L. “Does Guidance Change 
Attitudes?” Occupations, XIV (1936), 948-52. 

24. Newland, T. Ernest and Ackley, W. E. “An Experimental Study of 
the Effect of Educational Guidance on a Selected Group of High 
School Sophomores.” Journal of Experimental Education, V (1936), 
23-5. 

25. Oakley, C. A. “A First Follow-up of Scottish Vocationally Advised 
Cases.” Human Factor (London), XI (1937), 27-31. 


26. Paterson, D. G. and Darley, J. G. Men, Women, and Jobs. Minneapolis: 
University of Minnesota Press, 1936. 

27. Paterson, D. G. and Langlie, T. Report of a Controlled Experiment 
on the Value of Faculty Advisers for Probation Students in the College 
of Engineering, Chemistry and Architecture, University of Minnesota, 
1925-26. (Unpublished.) 

28. Remmers, H. H. “Measuring Attitudes Toward Vocations.” Studies 
in Higher Education, Purdue University, XXXV (1934), 77-83. 

29. Remmers, H. H. and Whisler, L. D. “The Effects of a Guidance Program 
on Vocational Attitudes.” Studies in Higher Education, Purdue 
University, XXXIV (1938), 68-82. 

30. Rodgers, T. A. “A Follow-up of Vocationally Advised Cases.” Human 
Factor (London), XI (1937), 16-26. 

31. Seipp, Emma. A Study of One Hundred Clients of the Adjustment 
Service. New York: American Association for Adult Education, 1935. 
30 pages. 

32. Sparling, E. J. Do College Students Choose Vocations Wisely? 
New York: Teachers College Contributions to Education, Columbia 
University, 1933. 

33. Stott, Mary B. “Criteria Used in England.” Occupations, XIV 
(1936), 953-57. 

34. Stott, Mary B. “Occupational Success.” Occupational Psychology 
(London), XIII (1939), 126-40. 

35. Thorndike, E. L. Prediction of Vocational Success. New York: The 
Commonwealth Fund, 1934. 

36. Trabue, M. R. and Dvorak, B. J. A Study of Needs of Adults for 
Further Training. Minneapolis: University of Minnesota Press, 1934. 

37. Viteles, M. S. “A Dynamic Criterion.” Occupations, XIV (1936), 
962-67. 

38. Viteles, M. S. “Validating the Clinical Method in Vocational Guidance.” 
Psychological Clinic, XVIII (1929), 69-77. 

39. Williamson, E. G. “Faculty Counseling at Minnesota. An Evaluation 
Study of Social Case Work Methods.” Occupations, XIV (1936), 
426-33. 

40. Williamson, E. G. How to Counsel Students. New York: McGraw-Hill, 
Inc., 1939. 

41. Williamson, E. G. “The Role of Faculty Counseling in Scholastic 
Motivation.” Journal of Applied Psychology, XIX (1936), 314-24. 

42. Williamson, E. G. A Summary of Studies in the Evaluation of Guidance. 
Report of the Fifteenth Annual Meeting of the American College 
Personnel Association, 1938. Pp. 73-7. 

43. Williamson, E. G. and Darley, J. G. Student Personnel Work. New 
York: McGraw-Hill, Inc., 1937. 


44. Wrenn, C. G. Recent Research on Counseling. Report of the Sixteenth 
Annual Meeting of the American College Personnel Association. 
Cleveland, Ohio, 1939. Pp. 88-94. 





THE LOGIC OF AGE SCALES* 

M. W. RICHARDSON 
United States Civil Service Commission 

An age scale is a type of psychological test designed to 
measure general mental ability ("intelligence") in terms of 
performance of various mental tasks found to be normal for 
various ages. The child whose performance is typical of ten- 
year-old children, for example, is said to have a mental age of 
ten. The method was invented by Alfred Binet. Although the 
techniques have been modified in detail by British and American 
psychologists and the device of the intelligence quotient (I.Q.) 
has been appended, the main outlines of Binet’s work have been 
retained. The most widely used of the age scales is the Stan¬ 
ford Binet. 
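
For reference, the quotient mentioned above is conventionally formed as the ratio of mental age to chronological age, multiplied by 100. The expression below is the usual ratio definition, supplied as a reminder and not quoted from this article; the symbols MA and CA are shorthand introduced here for illustration.

```latex
\[
\mathrm{I.Q.} = 100 \times \frac{\mathrm{MA}}{\mathrm{CA}},
\qquad \text{e.g. } \mathrm{MA} = 10,\ \mathrm{CA} = 8
\ \Rightarrow\ \mathrm{I.Q.} = 125 .
\]
```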

The age scale has been widely accepted. The I.Q., in particu¬ 
lar, has passed into the language of the general public, together 
with the common misconceptions connected with its brief history 
in science. The fact that a device has attained wide use is not a 
guarantee of its soundness; and it is sometimes necessary in the 
interest of sound scientific advance to examine critically proce¬ 
dures and devices in common use. If the criticisms in this paper 
seem to be directed chiefly to one particular age scale, the expla¬ 
nation is that this one scale is the most widely used and has 
been most carefully constructed and standardized. 

A person whose academic specialty is the logic of science 
addressed a group of psychologists on the necessary conditions of 
measurement. He discussed the familiar matter of equality of 
units and the operational test of equality of units by the coinci¬ 
dence of any part of the scale with any other upon superimposi- 
tion. He mentioned the matter of measuring a single variable at 
a time, and the necessity of having a real origin of measurement, 
if ratio comparisons are to be made. He followed this sound 
discussion of the scientific method with a curiously erroneous 
one; he congratulated psychologists in having, in the Binet age 
scales, a measuring device that meets all three of the requirements 
for a scientific measuring device. The writer and others carefully 
pointed out to the speaker that the age scale meets not one of the 
three requirements set up; it has no real origin of measurement; 
its units are not equal, and it does not isolate a single, unitary 
variable for measurement. 

* This article is adapted from a chapter in a forthcoming book on test 
theory by the same author. 

When it is uncritically considered, the age scale idea seems 
to be a happy one. What could be more simple and direct than 
to see how high the child can ascend an evenly graded scale? 
The concept of normality of performance made it the most 
natural thing in the world to describe such test performance in 
terms of the age for which the performance was normal. The 
concept of the age scale is a deceptively simple one, however, and 
the writer is of the opinion that the unstated special requirements 
and limitations of the age technique have received too little atten¬ 
tion. It is true that during the twenty-one years between 1916 and 
1937 many papers pointing to difficulties in scaling, scoring, and 
interpreting the Stanford Binet appeared. Moreover, several 
issues were kept continually in the foreground of attention. 
Unfortunately, the distinction between purely psychometric issues 
and psychological issues was not always made. The result is that 
certain deficiencies and limitations belonging to the mechanics 
of test construction were misinterpreted as psychological issues. 
A case in point is the constancy of the I.Q., about which it will be 
necessary to say more later. 


Validity of Age Scales 

A Binet scale consists of a series of sub-tests or items designed 
to measure “general intelligence,” whatever that may mean. For 
example, the 1937 edition of the Stanford Binet contains 127 sub-tests
graded in difficulty from tests suitable for two-year-olds to
those suitable for superior adults. The sub-tests were selected in
the process of construction from a larger number of sub-tests.
It is pertinent to inquire into the method of selection of the sub-tests.
In what respect does the method of selecting the sub-tests
insure that the resulting scale will be valid? One of the devices is
to plot the percentage of correct responses to any given sub-test
against the chronological age, after the sub-test has been applied
to children of various age groups. The [illegible] at which
[illegible] is taken as the scale-position





THE LOGIC OF AGE SCALES 

of the Binet tests many compromises with practical expediency 
are made. Under special conditions a mathematical function can 
be used to describe a discrimination curve of items against chronological
age. It has been shown that the use of this function is
simply an alternative to the correlation methods, and precisely
equivalent to them under certain special conditions.

This type of analysis throws light on the “validating” procedure.
The retention of sub-tests on the basis of sharp curves of
age discrimination is the same as retaining items that have
large correlations with chronological age. The criterion of validity
is simply chronological age, and the practical effect of the
procedure is to select items that have relatively high correlations
with chronological age.
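
The equivalence asserted above, that a sharp curve of age discrimination amounts to a large correlation with chronological age, can be illustrated with a minimal sketch. The pass counts below are hypothetical figures invented for the illustration (ten children tested at each age), and the function name is likewise an assumption; nothing here is taken from the standardization data of any actual scale.

```python
from math import sqrt

def item_age_correlation(ages, n_tested, n_passed):
    """Pearson correlation between pass/fail (scored 1/0) and chronological age."""
    xs, ys = [], []
    for age, n, k in zip(ages, n_tested, n_passed):
        xs += [age] * n                 # one entry per child tested at this age
        ys += [1] * k + [0] * (n - k)   # k children pass, the rest fail
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: ten children tested at each age from 6 to 10.
AGES = [6, 7, 8, 9, 10]
N_TESTED = [10, 10, 10, 10, 10]
STEEP_ITEM = [0, 1, 5, 9, 10]   # percentage passing rises sharply with age
FLAT_ITEM = [3, 4, 5, 6, 7]     # percentage passing rises slowly with age

r_steep = item_age_correlation(AGES, N_TESTED, STEEP_ITEM)
r_flat = item_age_correlation(AGES, N_TESTED, FLAT_ITEM)
print(f"steep item: r = {r_steep:.3f}")
print(f"flat item:  r = {r_flat:.3f}")
```

On these invented figures the item with the steeper curve shows the larger correlation with age, which is all that the selection procedure rewards.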

The procedure leads to a serious difficulty. The standing high
jump, or other athletic skills, yield similar discrimination functions,
since they are likewise positively correlated with chronological
age. The method of item selection thus breaks down as a
way of attaining validity. The only criterion of validity remaining
is the judgment of the persons constructing the scale. An
allied consideration is that the selection of items on the basis of
high correlation with mental age on the same or a previous scale
is merely a measure of internal consistency or reliability. Nothing
in the general procedure operates towards the selection of items
that measure a unique trait. An interesting logical difficulty
appears. Suppose that, out of the hand-picked collection of items
supposedly measuring the aspects of intelligence desired in the
scale, the items selected are those which have the steepest discrimination
functions. Let us assume further that two items are
so discriminating and so far apart in proper age-location that
their discrimination functions do not overlap. The result is that
the two hypothetical “good” items or sub-tests have a zero correlation.
A scale made up of such items must necessarily be unreliable
as a composite measure. Furthermore, to the extent to which
the search for valid sub-tests by this procedure should be successful,
the number of different factors measured would increase.
Evidence at present suggests that no fewer than six different
mental functions are measured in a higgledy-piggledy fashion
by the Stanford Binet. The multiplicity of factors is perhaps not
so serious as the fact that different things are measured at different
ages. The sobering fact about the age-scale technique is that
we do not know what is being measured, or what any given intelligence
quotient means in terms of the relative standing of the
individual.

As an added complication, the score received by the testee is
expressed in terms of mental age. Mental ages are statistical
numbers based on the concept of normal or average test performance
of children of a given age, when the sample of children is
representative of some population of children. On page 25 of
Measuring Intelligence, Terman and Merrill state that the expression
of a test result in terms of age norms rests upon no statistical
assumptions. The statement is erroneous and misleading. The
truth of the matter is that the mental age is a measure derived
from raw scores in accordance with certain assumptions; it is as
definitely statistical in nature as the standard score, for example.

In using scales of the Binet type, we choose to express test
performance, not in the arbitrary units of number of items passed,
but in terms of “mental years and months.” The raw score “units”
are of course arbitrary, in the sense that they are not units at all.
The child who answers 12 items correctly cannot be said to exceed
the child who passes 9 by the same amount that the latter surpasses
a third child who answers 6 correctly. No ordinary test
can be expected to satisfy the additive property required for
measurement on a scale. But the mental year or the mental month
is likewise not a real unit of measurement. In order for the
mental year (or month) to be a real unit of measurement, it
would be necessary for the function representing mental growth
to increase regularly with chronological age. If, during each
year, a child had the same increment of mental growth, the mental
year or mental month would be constant in value. However,
it is commonly agreed that the child matures less and less rapidly
as he grows older, in intelligence as well as in physical characteristics.
The annual increment of “intelligence,” i.e., a mental year,
steadily becomes less until mental maturity is reached, at which
time it is zero. An age scale is, therefore, not a true scale because
it is not built up from equal units. In this connection, it may be
noted that the true shape of the mental growth curve cannot be
determined from scores expressed in terms of mental ages. If
such were attempted, one would get results predetermined by
the crude growth curve adopted in order to express raw scores
in terms of mental ages.

Whatever the merits of the assumed growth curve may be, the
crucial consideration is that its true shape and its upper limit
cannot be determined by use of a “scale” expressed in terms of
mental ages. The true shape of the mental growth function can
be determined, strictly speaking, only by use of a scale with real
units of measurement. Once the mental growth function is established,
it is possible to calibrate the underlying true scale in
terms of mental ages, if desired. Then the interval between the
mental age of four and the mental age of five might be expressed
as a certain fraction of the scalar unit, the mental year between
five and six as a somewhat smaller fraction of the same real unit,
etc. Finally a place would be reached where the mental year is
a negligible fraction of the real unit, and therefore has a value
of practically zero. We would then have a proper (although
indirect) experimental solution of the problem of the limit of
mental maturity. The widespread use of mental ages has not
helped to solve the problem, mainly because the use of mental
ages as derived measures begs the question.

Although the exclusive use of mental ages forever begs the
question of the limits of mental maturity, it is urged that a
definite and well-accepted social meaning has been attached to
them. It seems simple enough to define a mental age of nine as
the average or median test performance of typical nine-year-old children.
The definition works well enough until the limit of mental maturity
is reached. If the limit of maturity is assumed to be 15, a
mental age of more than 15 is impossible, by definition. Mental
ages of more than 15 are assigned in the process of standardization
to test performances by use of “cut-and-try” procedures
based on some not well-defined assumptions. At best it is unfortunate
that the definition of mental age must be radically shifted
at one point or region in the age scale.

The I.Q. and Its Troubles

To multiply confusion, the device known as the intelligence
quotient has been adopted. The intelligence quotient is defined as
100 times the ratio of mental age to chronological age, and is thus
an index of brightness. An index of brightness can of course
be no more than a statistic relating the individual's test performance
to the average performance of those of the same age. It is
exactly as true of the I.Q. as it is of other possible statistics serving
the same purpose that one must always use it in connection with
some measure of variability of test performance within the age
group. Obviously one measure taken from a distribution has no
meaning unless a measure of dispersion is given. It might be
argued that clinicians keep in mind some kind of subjective scale
which serves in lieu of the ordinary statistical parameters. If so,
the mental feat is remarkable since the various age groups, in one
age scale at least, have dispersions of intelligence quotients which
vary considerably. The difficulty of interpretation of the single
statistic (the I.Q.) is greatly increased where the standard deviation
of I.Q.’s of one chronological age group may be twice as
large as that of another age group. The intelligence quotient
shares with the mental age from which it is derived the fictitious
character of measures above the maturity level. It is nonsense
to describe an adult as having an I.Q. of 120 because such a statement
is based on an irrational definition and is unverifiable experimentally.
It is the practice of Binet testers to assume some definite
chronological age as the upper limit of mental growth. Thus,
it is assumed on at least two age scales that the upper limit of
the average child is reached at the age of fifteen. The crucial
difficulty in making such an assumption is not that the upper
level set may be wrong, but it lies in the utter impossibility of
checking up on its correctness by means of age scales.
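
The definition just given, and the dependence of an adult quotient on the assumed upper limit, can be put in a short sketch. The function name and the `ca_ceiling` parameter are assumptions made for illustration; in particular, capping the divisor at the assumed limit is one plausible reading of the practice described above, not a procedure spelled out in the text, and the adult mental age of 18 is a hypothetical figure.

```python
def intelligence_quotient(mental_age, chronological_age, ca_ceiling=None):
    """I.Q. = 100 * mental age / chronological age (both in years).

    ca_ceiling is an assumed upper limit of mental growth. When it is given,
    the divisor is capped at that age (an illustrative assumption about how
    such a limit would be applied to adults).
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    divisor = chronological_age
    if ca_ceiling is not None:
        divisor = min(chronological_age, ca_ceiling)
    return 100.0 * mental_age / divisor

# A child of eight with a mental age of 9.6 years.
child_iq = intelligence_quotient(9.6, 8)

# A hypothetical adult of thirty credited with a "mental age" of 18:
# the quotient obtained depends entirely on the ceiling that is assumed.
adult_iq_ceiling_15 = intelligence_quotient(18, 30, ca_ceiling=15)
adult_iq_ceiling_16 = intelligence_quotient(18, 30, ca_ceiling=16)

print(child_iq, adult_iq_ceiling_15, adult_iq_ceiling_16)
```

The same adult performance yields different quotients under different assumed ceilings, and nothing in the age scale itself can decide between them.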

One of the moot questions about the I.Q. is its constancy.
It seems unfortunate to the writer that so much time of psychologists
has been wasted on such a matter. It seems that what ought
rightly to be merely a formal problem in test construction has
been translated into one of spurious psychological significance.
The only question properly asked at this time may be definitely
stated: Did the authors of the age scale succeed in constructing
a device which gives a constant I.Q.? Questions involving changes
in I.Q. possibly attributable to environmental factors must always
take into account the fluctuations of the I.Q. which are due to the
test and to the statistical operations used to determine the intelligence
quotient. Failure to consider the expected magnitude of
fluctuation of the I.Q. may easily result in gross misinterpretations
of time-changes in its value for any one individual.

Before we can properly evaluate the effect of organic and
environmental factors on the intelligence quotient, we are forced
to consider the variations in the I.Q. inherent in the testing
technique. The fluctuations are those associated with the concept
of reliability. For the 1937 Revision of the Stanford-Binet Scale
the estimated reliability coefficient varies from 0.90 to 0.98, the
higher reliability being associated with the lower I.Q. intervals.
A median value for those near 100 I.Q. is 0.92. A representative
value of the standard error of measurement is 4.5. Certain systematic
variations in the individual I.Q. are also found. The
practice effect is one type of systematic variation. Terman and
Merrill estimate that the mean increase in I.Q. on the second test
(which means on the other form, since two forms, L and M, are
provided) ranges from 2. to 4.4, when the time interval between
testings is short. The increase due to practice effect is presumably
greater when the same form is repeated.

Another systematic source of error may lie in details of the
construction of the age scale. Thus, it might be a characteristic
of certain age scales that the I.Q. of a superior child decreases
with chronological age. Strictly speaking, nothing in the definition
of the I.Q. requires constancy during the entire period of
development. The constancy of the I.Q., if it exists, is imposed
by the process of standardization. Therefore, an experimentally
obtained constancy of the I.Q. proves only that the scale has been
constructed in such fashion as to produce constant I.Q.’s except,
of course, that random fluctuations will still be present. When,
and only when, an age scale has the characteristic that I.Q.’s of
individuals at all age levels tend to remain constant, one may
attach significance to the case of the unusual individual whose
I.Q. does not remain constant. The significance of any such shift
of I.Q. in time must be judged in relation to the normal shift to
be expected by random error, or unreliability.

Increases or decreases of the order of magnitude of three
times the standard error of measurement must first be tested with
respect to the magnitude of variable errors of measurement before
it is legitimate to entertain the hypothesis that some other factor
such as special therapy, change in environment, or organic change
is responsible for the shift in I.Q. In addition, obtained increases
should be scrutinized carefully from the standpoint of possible
practice effect. All such interpretation is predicated on the basis
of constant I.Q., as built into the scale itself. One may properly
inquire just how a scale with constant I.Q. may be constructed.
In view of the inherent difficulties with the mental age and intelligence
quotient, it is impossible to state any perfectly general
rules. However, the part of the scale which is treated as if the
growth curve were linear, viz., two to 13 years, can be abstracted
for discussion. The problem the test constructor faces is that of
providing that most individuals will be assigned the same I.Q.,
within the reliability of the scale, every year from two to 12
inclusive. If the various sub-tests are properly scaled, i.e.,
assigned to a given year level as a median performance of unselected
children of that age, the I.Q. of 100 will remain constant.
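
The screening step described at the opening of this paragraph, in which an observed shift is set against three times the standard error of measurement before any other explanation is entertained, might be sketched as follows. The function names, the `practice_gain` allowance, and the treatment of the two testings as having independent errors of equal size are assumptions made for the illustration; only the representative standard error of 4.5 and the multiple of three come from the discussion above.

```python
from math import sqrt

SEM_IQ = 4.5  # representative standard error of measurement quoted above

def within_error_band(first_iq, second_iq, sem=SEM_IQ, multiple=3.0,
                      practice_gain=0.0):
    """True when the shift is no larger than `multiple` standard errors.

    practice_gain is a hypothetical allowance, subtracted from the observed
    change, for the expected gain on a second testing.
    """
    shift = (second_iq - first_iq) - practice_gain
    return abs(shift) <= multiple * sem

def standard_error_of_difference(sem=SEM_IQ):
    """Standard error of the difference between two obtained I.Q.'s.

    Assumes independent errors of equal size on the two testings,
    the usual classical simplification.
    """
    return sem * sqrt(2)

print(within_error_band(100, 110))   # a shift of 10 points
print(within_error_band(100, 118))   # a shift of 18 points
print(round(standard_error_of_difference(), 2))
```

A shift that falls inside the band is of a size that errors of measurement alone could plausibly produce, so it calls for a test against those errors before therapy, environment, or organic change is invoked.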



Inconstancy is to be expected in intelligence quotients other 
than 100, unless certain conditions are satisfied. One condition is 
that the variability of test performance increases for successive 
age groups from two to 12. If we assume that children of a 
given typical group vary more among themselves as they grow 
older, it is possible to arrange matters so that the I.Q. of most 
individuals will be constant. Suppose that we have an individual 
whose I.Q. at age eight is 120. The mental age from which the
I.Q. is estimated is 9.6 years (or 9 years, 7 months, approximately).
Let us further suppose that we have a distribution of
mental ages of all the children in our sample of eight-year-olds.
The mental age of 9.6 years is, say, one standard deviation above
the mean of the distribution of eight-year-olds. The standard
deviation of the distribution is 9.6 - 8 = 1.6 mental years.

Now, if the same child is to have an I.Q. of 120 at the age of
nine, his mental age will then be 10.8 years. If the nine-year-old
sample is composed of the same children we should expect, except
for errors in measurement, that the child will have the same position
in the nine-year distribution that he had in the distribution
of eight-year-olds. Since his mental age is now 10.8, the standard
deviation of the distribution of nine-year-olds is 1.8 mental years.
Similarly, the standard deviation of the ten-year-olds must be
2.0 mental years; of eleven-year-olds, 2.2 mental years; etc. The
preceding illustration shows that, for an assumed linear growth
function, the standard deviations of the mental ages must have
constant increments for each advancing year. How shall this be
done? Considering, for sake of simplicity, that we have just six
sub-tests at each year level, we may increase the standard deviations
of successive year levels by (a) selecting sub-tests which
have higher intercorrelations at the older age levels, or (b) assigning
a larger number of mental months to each sub-test. It will be
seen at once that the latter is inadmissible since a total of 12
mental months is assigned at each year level. The conclusion is
inescapable that the degree of correlation between sub-tests must
increase steadily with higher age levels if the I.Q. is to be
constant.
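
The arithmetic of the preceding illustration can be reproduced in a few lines. The sketch assumes, as the illustration does, that the mean mental age of each group equals its chronological age and that the child stays exactly one standard deviation above the mean; the function names are hypothetical, and exact fractions are used only to keep the figures free of rounding error.

```python
from fractions import Fraction

def mental_age(iq, chronological_age):
    """Mental age (in years) implied by a given I.Q. at a given age."""
    return Fraction(iq, 100) * chronological_age

def required_sd(iq, chronological_age, z=1):
    """Standard deviation of mental ages needed at this age.

    Assumes the group mean equals the chronological age and that a child
    with this I.Q. stands z standard deviations above that mean.
    """
    return (mental_age(iq, chronological_age) - chronological_age) / z

for age in (8, 9, 10, 11):
    ma = mental_age(120, age)
    sd = required_sd(120, age)
    print(f"age {age}: mental age {float(ma)}, required S.D. {float(sd)}")
```

Each advancing year adds the same increment, one fifth of a mental year in this case, to the required standard deviation.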

The foregoing treatment is theoretical and does not imply
that the authors of any [illegible] ...dictable fashion; it may be
sufficient to account for the relative
constancy of I.Q.’s attained in age scales.

It is difficult, however, to account for the attainment of constancy
of I.Q. and, at the same time, approximately equal dispersion
at the various age levels. The range of standard deviations
of I.Q.’s reported for various half-year levels is from 12.5 to 20.7.
A representative value is 17. The values vary considerably, probably
because of accidents of scale construction. Certainly the
values given by Terman and Merrill do not vary systematically
with age, and the authors assume that the true variability is
nearly constant from age to age. However, let us consider a
half-year group as an approximation to “point” age. If all individuals
within such a half-year group are considered to be of the
same chronological age, the mental ages are proportional to the
intelligence quotients, i.e., a plot of M.A. against I.Q. is linear.
Even for a half-year interval, approximate linearity must hold;
otherwise the definition of an I.Q. is meaningless.

It follows that if half-year groups have the same I.Q. dispersion,
they must have approximately the same mental age dispersion.
But the mental age dispersions must increase from year to
year in order for I.Q.’s of individuals to be constant. It thus
appears that two possible properties of the I.Q. are inconsistent
and not attainable at the same time, in any strict sense. The most
serious criticism to be directed against Terman and Merrill’s discussion
of the matter is that they tend to treat the (roughly)
approximate equality of dispersion of I.Q. at the various age
levels as experimental facts, as perhaps having psychological significance.
The I.Q. is a statistical concept, having the properties
we put into it by the accidents of cut-and-try scale construction
or which we force it to have by conscious design. If we postulate
that our statistical index shall have certain properties, we
can then construct a test in accordance with our imposed
requirements.

The only reservation is that we may possibly have imposed
characteristics which are mutually inconsistent, in which case
we perforce discover the source of the difficulty. If we fail to
limit the properties of a statistic by rational design, the vagaries
of that statistic will be brought to light in subsequent empirical
studies. The result is the raising of such false issues as the
constancy of the I.Q. The gist of the matter is that the I.Q.
can be made to be constant, if that is thought to be a desirable
property. If the scale is not constructed in such a way as to give
constant I.Q.’s for (most) individuals, the die is cast and the
I.Q. will not be constant.

In summary, it may be stated that the age scale technique:

(1) possesses no advantage over group test methods

(2) has no straightforward rationale so that the process
of standardization may proceed without the necessity
for “adjustments”

(3) meets none of the three requirements of real mental
measurement

(4) leads to much useless work in correcting the previous
“standardization”

(5) embodies the mental age and intelligence quotient, both
extremely unfortunate concepts

(6) leads to problems of spurious psychological significance,
such as the constancy of the I.Q.

(7) makes impossible any solution of the mental growth
function

(8) has led to dubious devices and untenable interpretations
of various sorts, among them “scatter” and measure
of “mental deterioration”

It is recommended that the age scale technique in its present 
form be abolished in its entirety, and that it be supplanted by 
reliable homogeneous group tests of single functions. The latter 
can be recombined, if desired, into a single index of mental 
capacity based on position in year group. A better procedure is 
to continue work towards some real unit of measurement, to the 
end that departures from normal growth in several functions may 
be discovered and clinically interpreted. If, by reason of demand 
from teacher, parent, or psychiatrist it will seem necessary to 
give a general index of (average) mental level reached, it can be 
done by use of a suitable combination of measures furnished by 
the separate tests. It is desirable, however, to avoid the use of a 
single index of mental level. 

It has been urged in defense of the Binet test that during its
administration, the trained clinical psychologist has an opportunity
to make observations of the child’s behavior other than that
required for rating “general intelligence.” It is maintained that
such observations may have as much value as, or more value than,
the mental age, in getting a “clear picture” of the individual
tested. If such clinical insights can be reliably obtained and
recorded, the obvious desideratum is a standardized interviewing
technique, to be applied and interpreted entirely separately from
the measures of primary abilities.





COUNSELING ON THE BASIS OF INTEREST 
MEASUREMENT* 

JOHN G. DARLEY
University of Minnesota 

As the counselor studies his available data on abilities,
achievement, interests, personality, and background of the student
facing him in the interview, he must select a conversational
starting point that will establish rapport and get the interview
under way. At some early time he must discuss the student's
stated reason for seeking help, and eventually he must interpret
the interest test data in a manner understandable to the student.
Assume that the student makes A scores on the occupational keys
for Y.M.C.A. secretary and personnel manager, and B+ for
school superintendent and social science teacher on the Strong
Vocational Interest Blank. Assume that his claimed occupational
choices are business, engineering, and “executive work.” He feels
the need of help in making a final occupational choice.

At the point of interest test interpretation, the counselor can 
make this bald statement: “You have the interests of a Y.M.C.A. 
secretary or a personnel manager!” With minor modifications 
this is probably the standard approach to interpretation. There 
is no more probable way to lose a case than this. It is the least 
effective clinical approach, for the following reasons: 

1. The student’s spoken or unspoken response is usually 
“How can you say that? I never was a Y.M.C.A. secretary 
or a personnel manager!” At this point the counselor must 
backtrack and start a rather incoherent explanation of the 
basis of interest measurement, to his own and the student's 
confusion. 

2. If the student accepts the statement without raising
the foregoing issue in some form, the chances are he will
re-interpret the statement, then or later, to mean that he
has the ability to be a Y.M.C.A. secretary or a personnel
manager, and that these are two jobs where his success is
guaranteed. If any other factors interfere with curricular
or job success, he claims he “was told he would succeed in
these jobs.”

*This article is the first draft of a chapter in a forthcoming monograph
entitled: Clinical Aspects and Interpretation of the Strong Vocational
Interest Blanks. Other theoretical and interpretive phases of interest
measurement are treated more extensively in the monograph, to be
published by the Psychological Corporation.

3. Such statements run the risk of flouting countless 
stereotypes, prejudices, specific dislikes, or misconceptions 
evoked by occupational labels, either in the student or in 
his parents. Very few people know what a personnel manager
does, and there are substantial and not always complimentary
stereotypes about Y.M.C.A. workers. The interesting
fact that these labels are directly at variance with
the student’s claimed choices operates also to set up resistances
in the student, although the student’s own specific
choices are also hedged around with favorable stereotypes
which may be equally invalid, and although there is considerable
evidence on the instability and invalidity of the
student’s claimed choices. 


4. Such statements run the risk of moving the discussion
too early in counseling to the temporarily irrelevant
factors of opportunities, salaries, prestige values. The
counselor is forced to waste precious time giving data (if
he knows of any) on these points before having established
an understanding of the interest type being discussed.

5. Such statements fail to take into account the vital
factors of levels of ability and past achievement, which
determine the level of future academic achievement most
probably attainable; educational disabilities affecting educational
progress in a correct curricular and occupational
area; amounts of relevant specific aptitudes, in addition to
level of general scholastic ability; and personality characteristics
related to job success or satisfaction. Specific
patterns of interest unaccompanied by ability and past
achievement sufficient to permit curricular competition in
professional schools occur frequently in counseling, because
of the relatively low general correlations between
measured interests and measured abilities or achievement.

Strong has published correlations of each occupational
key with an intelligence test.¹ On the original blank the
correlations range from -.36 to .38. Segel and
Brintle² collected interest test scores, college grades, and
[illegible] scores from 100 junior college freshmen.
Using interest test scores for the keys for doctor, lawyer,
[illegible], personnel manager, and purchasing
agent, they found only one positive correlation above
[illegible] with selected parts of the Iowa High School Content
Examination: the correlation between engineering interests
and measured achievement in mathematics. Achievement
in mathematics and science correlated .28 and .29, respectively,
with measured interests in medicine. Achievement
in English literature, science, and social studies correlated
-.43, -.26, and -.26 respectively with measured interests
of a purchasing agent. The correlations between subject
matter grades and measured interests were even lower than
those between achievement tests and measured interests.
Grades in mathematics and science correlated only to the
extent of .14 with interests in engineering, while grades in
history correlated -.47 with interests in engineering. The
authors were sufficiently encouraged by these relations between
scholastic accomplishment and interest test scores
derived from studying adult occupational groups to suggest
that “scales for scoring the Strong Interest Tests should
be devised for the principal subject groups in higher secondary
education.” However, the obtained correlations were
so low that the clinician must be extremely careful to keep
interests and abilities or achievement separate in his own
thinking, and to see that there is no such confusion in the
student’s thinking.

¹[illegible]

²[illegible], “The Relation of Occupational Interest Scores as Measured
by the [illegible] to Achievement Test Results and College Marks in
[illegible].”

This error in counseling is particularly tragic and inexcusable
where the occupations being discussed in terms
of the interest test are those for which society demands
college training prior to certification for professional competition.
It is equally inexcusable in cases where the occupation
can be entered with or without specific advanced
training, as in the case of general measured interests in
business. But in such cases, the counselor can cover his
error by saying later what he should have explained earlier,
namely, that in such occupations, success or satisfaction
in the occupation is still possible even though success or
satisfaction is not possible in a curriculum which may bear
some degree of resemblance to the occupation, but which
is not yet an indispensable prerequisite. This explanatory
technique can be effectively used in “downgrading” some
cases.

6. Such statements also fail to take into account the
problem mentioned earlier³ in regard to the present-day
representativeness of norm groups, as exemplified in the
psychologists’ key.

³The monograph from which this chapter is taken discusses the representativeness
of Strong’s standardizing groups as a factor in interpretation
of the interest test results.

7. Finally, such blunt statements omit consideration of
possible changes of specific measured interests which, while
infrequent, may occur under certain conditions. Strong
states this position clearly: “Prognostication of future behavior
cannot safely be based upon the presence or absence
of any single interest, but it does appear that to a considerable
degree at least it can be based upon the entire constellation
of interests.”⁴ In the article quoted, test-retest
correlations of the specific keys ranged from .59 to .84 over
a five-year interval beginning with the senior year in college.

Furthermore, in using the blank with younger students,
it is usually more important to determine the interest type
than the specific occupational interest. Carter, Pyles, and
Bretnall⁵ have demonstrated the presence of the types at
the average age of 16.5, whereas Carter and Jones⁶ have
shown that only 17 per cent of tenth-grade students receive
specific A scores on the keys appropriate to their occupational
choices. Thus the counselor who uses the test with
younger cases must remember that the standardizing procedure
based on levels of scores made by adults may not
yield an A score to a high-school student on a key within
the interest type in which he may have a legitimate and
dominant pattern.

With this understood, the test becomes clinically useful 
in the age range from about 15 years and up. But the
counselor who looks only for single A scores cannot make 
effective use of the test in this age range. This difficulty 
would be clearly eliminated if a technique such as standard 
scores could be used as the reporting device for younger 
cases. Then the higher pattern of standard scores within 
an occupational group would stand out more clearly on the 
individual’s profile, where the letter grade scores, based on 
adult norms, do not show intra-individual patterns so 
clearly in younger cases. 
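
The suggestion above, that standard scores would let the pattern within a younger student's profile stand out, can be illustrated with a minimal sketch. Every number below is hypothetical: the raw key scores, the norm means, and the norm standard deviations are invented for the illustration, and the choice of norm group (here imagined as a group of the student's own age) is an assumption that the text leaves open.

```python
def standard_score(raw, norm_mean, norm_sd):
    """Deviation of a raw score from the norm mean, in standard deviation units."""
    if norm_sd <= 0:
        raise ValueError("norm standard deviation must be positive")
    return (raw - norm_mean) / norm_sd

# Hypothetical norms (mean, standard deviation) for four occupational keys.
HYPOTHETICAL_NORMS = {
    "Y.M.C.A. secretary": (20, 10),
    "personnel manager": (15, 10),
    "school superintendent": (10, 8),
    "engineer": (30, 20),
}

# Hypothetical raw key scores for one high-school student.
raw_scores = {
    "Y.M.C.A. secretary": 35,
    "personnel manager": 27,
    "school superintendent": 14,
    "engineer": 10,
}

profile = {
    key: standard_score(raw_scores[key], *HYPOTHETICAL_NORMS[key])
    for key in raw_scores
}
ranked_keys = sorted(profile, key=profile.get, reverse=True)

for key in ranked_keys:
    print(f"{key}: {profile[key]:+.1f}")
```

Read this way, the relative height of the keys within one interest type is visible even when none of them would earn an A on the adult letter-grade scale.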

These statements of the ineffective way to interpret interest
test scores, and the reasons therefor, grow out of bitter clinical
experience. There is fortunately a more effective alternative.
Suppose, in this hypothetical case, no reference is made to the interest
test scores until late in the counseling interview. Suppose,
further, that the counselor draws out of the student, by questioning,
the reasons behind the student’s own choices of business,
engineering, or “executive work.” He will discover much superficial
thinking about jobs, which is in itself important. But he will
also discover the specific factors leading to the choices: information
(or misinformation) regarding salary scales and “overcrowded”
or “undercrowded” fields and job duties; satisfaction
expected from the job; self-estimates of strong and weak abilities
or subject-matter fields; evidences of family pressures or traditions
dictating the choices; self-estimates of aspirations and motives
that are operative in the choices; and evidences of out-of-school
experiences shaping the choices.

⁵[...] Carter, [...] K. Pyles, and E. P. Bretnall, “A Comparative Study [...]

Suppose, finally, that the counselor is familiar with the “in¬ 
terest types” or “interest patterns” growing out of factor analysis 
studies. 7 The counselor can then direct the questioning at get¬ 
ting the student to evaluate activities which are related to the 
interest type and which are within the scope of his experience 
with his environment. Questions can also be used to evaluate 
those experiences contra-indicating the type into which the stu¬ 
dent's claimed choices fall. 

Specifically, in the hypothetical case, unhappy experiences 
with mathematics would contra-indicate the technological inter¬ 
est type, in which the claimed choice of engineering is included. 
Participation in Hi-Y work and summer camp jobs may be drawn 
out as bits of evidence in favor of the welfare or uplift type in 
which some of the measured interests fall. A discussion of “exec¬ 
utive work” as a pervasive problem of dealing with people takes 
it out of the claimed realm of a business activity alone. 

Notice that the student has not yet been informed of his own 
specific measured interests. Notice also that the counselor has 
used the test scores in directing his questions to evoke relevant 
experiences and to clarify the student’s thinking about jobs. At 
or near this point, the counselor will be ready to tell the student 
what his basic interest type seems to be, with some chance of 
getting this idea across by saying: “It seems to me that your 
basic interests are in helping people or in working with them 

7 Available factor analysis studies establish the qualitatively different 
types of interest patterns somewhat as follows: interest in scientific or 
technological activities; interest in verbal or linguistic activities; interest 
in business contact activities; interest in business detail activities; interest 
in welfare or uplift activities. The specific occupational keys for 
which the men's interest test is scored may be approximately grouped in 
these five categories. To make a clinical determination of the intensity of 
the interest type, the following procedure has been used with experimental 
verification, including tabulations of frequency of occurrence: for an individual 
student, the primary interest pattern is the interest type within 
which he shows a preponderance (majority or plurality) of A and B+ 
scores on the specific occupational keys; the secondary interest pattern is 
the interest type within which he shows a preponderance of B+ and B 
scores; and the tertiary interest pattern is the interest type within which 
he shows a preponderance of B and B- scores on the specific keys. 



in an effort to bring about an improved adjustment, rather than 
in technical, impersonal activities, or in piling up a tremendous 
fortune.” Then he may discuss specific occupational duties and 
labels as representatives of the basic type, phrasing his remarks 
somewhat as follows: “These basic interests in helping people 
(or working with people) would be satisfied by the job of a personnel 
manager, for example, who is responsible for . . .” (and 
then may follow a description of job duties and responsibilities 
and types of training) “. . . ; or those same interests would find 
an outlet in the type of work that a Y. secretary might do. So 
far as training is concerned, these two jobs require somewhat 
different types of abilities and aptitudes as we can see in studying 
the two curricula involved; therefore, it is important to see how 
your abilities and past achievements line up with the two 
choices. . . .” 

In this way the A and B+ scores are introduced as examples 
of occupational outlets for the interest type rather than rigid 
occupational prescriptions for this student, and due allowance 
can be made for existing curricular differences. 
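
The footnoted procedure for fixing the primary, secondary, and tertiary interest patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the assignment of occupational keys to the five interest types (`KEY_TO_TYPE`), the sample grades, and the reading of "preponderance" as the single type with the largest count of in-band grades are all hypothetical and are not taken from the scoring manual.

```python
from collections import Counter

# Hypothetical grouping of occupational keys into the five interest types
# named in the footnote; these assignments are illustrative only.
KEY_TO_TYPE = {
    "engineer": "technical",
    "chemist": "technical",
    "journalist": "verbal",
    "lawyer": "verbal",
    "sales_manager": "business_contact",
    "life_insurance_salesman": "business_contact",
    "accountant": "business_detail",
    "office_worker": "business_detail",
    "personnel_manager": "welfare",
    "ymca_secretary": "welfare",
}

# Letter-grade bands for each pattern level, as stated in the footnote.
BANDS = {
    "primary": {"A", "B+"},
    "secondary": {"B+", "B"},
    "tertiary": {"B", "B-"},
}


def preponderant_type(grades, band):
    """Return the interest type holding the most keys graded within `band`.

    Assumption: "preponderance (majority or plurality)" is read as the one
    type with the largest count of in-band keys; a tie or no in-band keys
    yields None.
    """
    counts = Counter(
        KEY_TO_TYPE[key] for key, grade in grades.items() if grade in band
    )
    if not counts:
        return None
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no single type predominates
    return ranked[0][0]


def interest_patterns(grades):
    """Map each pattern level to its interest type (or None)."""
    return {level: preponderant_type(grades, band) for level, band in BANDS.items()}


# Invented letter grades for one student on the hypothetical keys above.
student = {
    "engineer": "C",
    "chemist": "B-",
    "journalist": "B",
    "lawyer": "B+",
    "sales_manager": "B-",
    "life_insurance_salesman": "B-",
    "accountant": "C",
    "office_worker": "B-",
    "personnel_manager": "A",
    "ymca_secretary": "B+",
}
patterns = interest_patterns(student)
print(patterns)
```

Because the footnote's grade bands overlap, a B+ score counts toward both the primary and the secondary tally in this sketch; a clinician might well resolve such overlaps differently.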

The advantages of this clinical procedure are obvious. It 
reduces to a minimum the arousal of resistances growing out of 
stereotypes or prejudices which the student may have about the 
occupational label. It permits the counselor, subject to his own 
imagination and knowledge of jobs, to generalize beyond the 
available keys on the blank and classify other occupations within 
the basic interest type, which is valuable when one realizes that 
there are about 20,000 occupational labels and only 36 occupational 
keys on the revised blank for men, and 17 occupational keys on 
the blank for women. It permits the counselor then to discuss 
levels of ability, achievement, and aptitude required for a wider 
range of jobs within the interest type, and thus it permits read¬ 
justments of the student's plans in the light of other pertinent 
data about him. It gives the student a clearer understanding of 
the place of interests in making a vocational choice, because the 
counselor can explicate the student's responses to his earlier 
questions as they relate to an interest type theory. It reduces 
to a minimum any conflict between the student’s specific choices 
and the counselor’s alternative suggestions, since both the spe¬ 
cific choices and the alternative suggestions are assigned to 
broader categories of interest types, where the student can more 
easily see his own status in regard to types of occupations. 

The clinical effectiveness of this alternative plan of interpretation 
has been demonstrated in the experiences of graduate students 
in supervised clinical training, and in the reaction of trained 
counselors to the plan. Students are less prone to misinterpret 
the outcomes of the interview; parents can see more clearly the 
relevance of specific educational and vocational suggestions made 
by the counselors; greater flexibility is possible in working out 
educational and vocational plans; more satisfaction is expressed 
by students with this form of counseling assistance in their voca¬ 
tional problems. 

No claim of infallibility, however, is made for the plan of 
interest test interpretation. It is not easy to learn, nor will it 
solve certain student problems of inflexible and over-emotional¬ 
ized or fixated vocational choices. It requires skillful interview¬ 
ing and careful explanations. 

There are other aspects in counseling on the basis of interest 
measurement that should be mentioned. The absence of a consistently 
significant correlation between specific occupational 
scores and either ability or achievement has already been mentioned. 
Yet in these studies certain experimental problems remain 
uncontrolled. Clinicians can cite many cases in which a student 
has substantially improved his college grades when he transfers 
to a curriculum that trains for an occupation which is within his 
basic and primary interest type. Students transferring from 
engineering to business administration, from medicine to journalism, 
from chemistry to teaching, and succeeding better after such 
transfers are familiar to all counselors. The grade increment 
cannot be attributed solely to easier academic competition in the 
second curriculum, since the second curriculum may demand no 
less general academic ability than the first, and may demand different 
types of special achievements and aptitudes than the first. 

If the interest measurement can be considered an approximate 
quantification of motivational factors, the following experiment 
would be significant. Choose a group of students having a pri- 
mary pattern and a group having a secondary or tertiary pattern 
in the interest type for which a given curriculum offers specific 
occupational training. Match cases from the two groups on the 
basis of scholastic ability. If the primary pattern in the interest 
type denotes more adequate motivation, the group having this 
pattern should earn better grades than its matched group, provided 
no disproportionate factors of disabilities or problems load 
this experimental group. 
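
The proposed experiment can be sketched as a matched-pairs comparison. The following is a minimal illustration, not a description of any study actually run: the data are invented, each student is represented as a hypothetical `(ability_score, grade_average)` pair, and the greedy nearest-score matching with a `tolerance` is an assumed matching rule.

```python
from statistics import mean


def match_pairs(primary, other, tolerance=2):
    """Greedily pair each primary-pattern student with the unused
    comparison student whose ability score is closest, if the two
    scores differ by no more than `tolerance` (an assumed rule).

    Each student is a tuple: (ability_score, grade_average).
    """
    unused = sorted(other)
    pairs = []
    for student in sorted(primary):
        best = min(unused, key=lambda s: abs(s[0] - student[0]), default=None)
        if best is not None and abs(best[0] - student[0]) <= tolerance:
            pairs.append((student, best))
            unused.remove(best)  # each comparison student is used once
    return pairs


def mean_grade_difference(pairs):
    """Mean of (primary grade - matched grade) over all matched pairs."""
    return mean(p[1] - o[1] for p, o in pairs)


# Invented data: (scholastic ability score, grade average).
primary_group = [(110, 2.9), (120, 3.1), (130, 3.4)]
secondary_or_tertiary_group = [(111, 2.5), (119, 3.0), (131, 3.0), (150, 3.8)]

pairs = match_pairs(primary_group, secondary_or_tertiary_group)
difference = mean_grade_difference(pairs)
print(len(pairs), round(difference, 2))
```

A positive mean difference would agree with the hypothesis that the primary pattern denotes more adequate motivation; a real study would also need a significance test and a check that the unmatched cases do not bias the result.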

Furthermore, when any raw score on an interest key above 
-.5 sigma can receive the same A grade, there is some question 
of the legitimate use of the Pearsonian correlation in studying 
the relation of occupational interest scores to other and more 
normally distributed variables, such as ability or achievement. 
Correlation ratio or contingency coefficient statistics may be more 
appropriate forms of analysis for these data. It is too early to 
consider occupational interest factors conclusively unrelated to 
curricular factors, in the light of examples to the contrary. 
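
For readers who wish to try the two alternative statistics named above, the sketch below computes a correlation ratio (eta) of a numeric variable on letter-grade categories and Pearson's contingency coefficient for two categorical variables. The function names and the sample data are assumptions made for illustration; only the standard textbook formulas are used.

```python
from collections import Counter, defaultdict
from math import sqrt


def correlation_ratio(categories, values):
    """Eta: square root of between-category sum of squares over total
    sum of squares, for numeric `values` grouped by `categories`."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total) if ss_total else 0.0


def contingency_coefficient(rows, cols):
    """Pearson's C = sqrt(chi2 / (chi2 + N)) for two categorical lists."""
    n = len(rows)
    cell = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (cell[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (chi2 + n))


# Invented data: interest letter grades against grade averages.
letters = ["A", "A", "B", "B", "C", "C"]
grade_averages = [3.0, 3.0, 2.0, 2.0, 1.0, 1.0]
eta = correlation_ratio(letters, grade_averages)

# Invented data: interest letter grades against achievement level.
interest = ["A", "A", "C", "C"]
achievement = ["high", "high", "low", "low"]
c_coefficient = contingency_coefficient(interest, achievement)
print(eta, c_coefficient)
```

Neither statistic assumes that the letter grades form an equal-interval or normally distributed scale, which is the reason for preferring them here over the Pearsonian coefficient.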


Conversely some counselors can cite cases in which superior 
or adequate grades are earned in a curriculum that trains for a 
vocation included in an interest group where the student has no 
primary pattern. Yet this need not be too alarming in the light of 
subsequent data about the occupational adjustments of graduates. 
Approximately fifty per cent of all the engineering graduates do 
not continue through life in the technical practice of engineering, 
and the chances are good that many in that fifty per cent have 
primary patterns in interest types other than the technical type. 


This leads to a final clinical aspect. The interest type in counseling 
must be considered in relation to the local institution’s 
curriculum organization in educational guidance. Examples are 
more clearly seen in terms of the blank for women. Many women 
make a primary pattern in the interest type which includes the 
secretarial and office worker keys. The normal curricular path in 
college may be the highly theoretical and technical economics of the 
existing school of commerce or business administration. Yet only 
a small proportion of college girls want to swallow this large dose 
of abstruse economic theory. The primary interest pattern is still 
a true picture of the occupational activities that would be satisfying 
prior to marriage; the curriculum may still be excellent 
for professional specialists in business, but the twain probably 
shall not meet happily. General education courses plus a minimum 
of training in basic office skills would solve the problem if 
the institution provides such a curricular organization. Otherwise 
[...] with a short course in a commercial business [...] 
summers or [...] school will [...] workable solution [...] 

Other examples will occur to counselors who are familiar with 
the detailed structure of their curricula as well as the curricular [...] 





THE COURSE IN SELF-APPRAISAL AND CAREERS 
OFFERED TO SENIORS IN THE CHICAGO 
PUBLIC HIGH SCHOOLS* 

GRACE MUNSON 

Bureau of Child Study, Chicago Public Schools 

Since February, 1939, seniors in the Chicago high schools have 
been given the opportunity to enroll in a course in Self-Ap¬ 
praisal and Careers. This course, with its subsequent counseling, 
constitutes the final step in the Adjustment Service. It is the 
culmination of the self-appraisal and educational planning which 
starts early in the elementary grades, is featured in the eighth- 
grade program of articulation between elementary and high 
schools, is an important aspect of the individual counseling at all 
year levels by high-school teachers in their daily adjustment 
periods, and is featured again in the third-year program for a re- 
check on mental abilities and reading achievement. These activi¬ 
ties are given continuity by the cumulative folder system and are 
supplemented all along the way by the individual service and fol¬ 
low-up studies of both elementary and high-school adjustment 
teachers collaborating with the Bureau of Child Study psychologists 
and demonstrators, for individual case studies, clinical 
treatment, and consultative service. 

As the Adjustment Service has now been operating in the 
high schools since 1937 and in the elementary schools since 1936, 
cumulative folders of the fourth-year students of September, 
1940, will contain data assembled over a period of three and a 
half years and in some cases longer. Each year the data will 
extend back farther until ultimately the complete school history 
with many successive measures of mental power and achievement 
will be available for the final guidance step. 

Given in the first half of the fourth year, the course in Self- 
Appraisal and Careers enhances and continues the self-appraisal 
of the earlier years by presenting the concepts of mental growth, 
of individual differences, and of the forces of self determination. 
It makes use of a wide range of scientific measuring techniques 
administered by the adjustment teacher or field psychologists for 

* This article is a summary of the description of the course as presented 
in the Superintendent’s Annual Report for the year 1939-40. 




the identification of specific areas of mental power, aptitudes, 
academic masteries, and vocational interests. And it teaches the 
techniques for interpreting the profiled results and the various 
conditioning factors. The self-appraisal section of the course 
gives each student a foundation in elementary psychology as a 
background for developing the techniques which will enable him 
to make continued self-appraisal as his pattern of powers and 
achievements changes with new growth and new experiences. 
The new understandings and newly accumulated data together 
with that assembled through the years are now used for making 
specific immediate plans and tentative future plans for educa¬ 
tional, vocational and avocational pursuits. 


Career Study Is Dynamic 


The careers section of the course provides studies in specific 
vocational areas using the most recently published books and 
pamphlet series, compilations of current occupational information, 
regional conferences with selected speakers from representative 
vocational areas, personal interviews with these 
speakers, radio broadcasts, and tours. Students acquire knowl¬ 
edge of the historical development of occupations, their social 
significance, legislative controls and significant trends as a background 
for the development of techniques which will prepare 
them to continue the study of vocations on the basis of the new 
experiences and the new skills that may be acquired in the 
changing and diversified world of work. 

The careers section of the course now lacks roots in earlier 
vocational studies comparable with the early development of 
self-appraisal. The problem of adjusting the high-school curric¬ 
ulum to accommodate such a course for all students earlier than 
the senior year has been studied with great care, since too early 
selection of vocations is detrimental, yet tentative choices should 
govern to some extent educational planning in high school. 
Beginnings have been made by introducing into selected subject- 
matter courses, study units on the vocational implications of a 
particular subject; books on vocations have been added to the 
[...] Shelf for the first-year English Reading classes, 
and the [...] libraries contain valuable sources of vocational 
information; the individual counseling at the eighth-grade level 
and successive high-school levels by adjustment teachers [...] 
and particularly by [...] 
some future vocational planning with the students; but 



the students, having had little opportunity to study careers, are 
unable as yet to contribute intelligently their rightful share of 
the planning. 

The development of a program preliminary to the fourth-year 
course is necessary since an earlier tentative selection of a career 
plan will contribute to good mental hygiene by developing security, 
responsibility, organization of effort, and growth in self 
determination. In this connection, plans are being formulated to 
drop the third-year testing program to the first half of the second 
year, using the New Chicago Tests of the Primary Mental Abilities 
which will yield more diversified data as a basis for self-appraisal 
and counseling; to introduce more specific career 
studies in the second-year curriculum together with a study of 
the total high-school organization of courses and facilities; and 
to establish more clearly defined routines to govern individual 
program-making from year to year by division-room counselors. 

In the senior course the psychological studies and the careers 
studies are presented in somewhat parallel order, one vocational 
area being finally selected for intensive study by each student, 
after the results from the psychological measurements have been 
profiled and interpreted by him. A sample profile is presented in 
Figure I. 

There is little attempt to match a given profile pattern to a 
particular vocation since scientific research has not been able to 
map the specific mental abilities required for insured success in 
a given vocation, and since an attempt to sort individual students 
into vocational pigeon holes would violate the fundamental democratic 
principles of public education. Yet wise counseling combines 
with student freedom of choice based on a knowledge of 
self and of careers, to give each student the security of tentative 
but specific plans. 

The teachers of the course assist students in the formulation 
of such plans through individual counseling as the course pro¬ 
gresses, using their non-teaching periods for this purpose. In 
the following semester each student goes over his plans again 
with the placement counselor if he seeks employment immedi¬ 
ately following graduation, or with the senior counselor or the 
adjustment teacher, selecting his college and his first college 
courses, if he plans to continue his schooling. Most adjustment 
offices maintain a library of college information and scholarship 
data. The adjustment teachers arrange for senior visiting days 
at the junior colleges, and confer with their personnel staffs for 
the orientation program and the transfer of data for such students 
as plan to attend. They also assist in organizing “College 
Day” when representatives from local and state colleges and 
universities present the advantages of their institutions. 

Figure I 

Profile of the test results for a student in self-appraisal and careers. 

[Figure: profile chart of one student's results, by test and date, under the headings Mental Power (Primary Mental Abilities), Silent Reading (Iowa, Advanced), High-School Content (Iowa), Special Abilities (quality, intensity, tonal movement, rhythm, pitch, pitch imagery, rhythm imagery), and Clerical (N.I.I.P.).] 

The course in Self-Appraisal and Careers has been organized 
and serviced by the Bureau of Child Study and the Bureau of 
Occupational Research subject to the advice and guidance of the 
Assistant Superintendent in Charge of High Schools. Confer¬ 
ences with principals have determined policies for the outlines 
and content of the course while the teachers have contributed 
many valuable devices and suggestions. Each semester the stu¬ 
dents have made constructive criticism to improve the value of 
the course for the next group. 

Since the course is a five-hour major elective it has not been 
accessible to all seniors following the old program of high-school 
studies. The new program, which will begin to operate in the 
next semester and which allows wider choice of electives, will 
permit more students to enroll. The course should eventually 
be made available to every fourth-year student. 

The following table shows the enrollments in successive 
semesters since the course was established in February, 1939: 


Enrollments in Self-Appraisal and Careers 

Calendar            No. of Schools    No. of Classes    No. of Students 
February, 1939            32                74               2600 
September, 1939           32                69               2500 
February, 1940            36                80               2800 


The course is taught without a textbook since no high-school 
textbook has been written that covers the psychological studies 
selected for the course and since most textbooks on occupations 
are likely to be out of date by the time they are printed. Instead, 
an extensive bookshelf of reference materials for both teachers and 
pupils is supplied, supplemented by current materials on occupa¬ 
tions. Students thus have an opportunity to read widely in the 
areas of their interests. To make the books more available to stu¬ 
dents, several books have been unitized for each school, by divid¬ 
ing them into from 14 to 43 sections re-mounted in manila covers, 
thus introducing a type of individualized instruction. This year 
a set of 10 reprints on psychological topics, written in popular 
vein by eminent leaders in that field, was supplied in class sets 
to each high school. 

Teachers’ lesson plans and outlines have been worked out and 



distributed to all of the schools, modified from semester to semester 
in accordance with suggestions from teachers and pupils. 
This year, in answer to the demands, student work sheets were 
prepared to accompany the teachers’ outlines. They were mimeographed 
and supplied in class sets. 

The Outline of the Course 


The outline of the course, arrived at by successive modifica¬ 
tions in the light of experience, is given below. It will be revised 
still further as experience indicates the directions in which it 
may be more useful to the students and more adequate in fulfill¬ 
ing its objectives. Lesson outlines for the teacher have been pre¬ 
pared for all sections of the course. Special study guides have 
been prepared for the use of students in connection with the 
topics which are starred. Units I and II have been prepared by 
the Bureau of Child Study, Unit III by the Bureau of Occupa¬ 
tional Research, Unit IV jointly. 


Unit I. *Introduction and Bibliography 

A. Aims and activities of the course 

B. Terminology 
C. Bibliography 


Unit II. Self-Appraisal (To be taught simultaneously with Unit 
III. It is suggested that each week, two days be spent on 
Unit II, two on Unit III, and one on testing. Constant interweaving 
should be practiced.) 

A. Existence of individual differences 
1. Family history and autobiography 

a. Racial and cultural background 

b. Family traits, vocations and achievements 

c. Health history 

d. Educational history 

e. Hobbies 

f. Social development 

g. Occupational experiences 

h. Plans for the future 

*2. Physical and mental differences between people 

a. Types of differences 

b. The total personality 

c. The normal curve 

d. Applications to the testing program 

e. Educational implications 

f. Vocational implications 

* Special study guides have been prepared for the use of students in 
connection with the topics which are starred. Lesson outlines for the 
teacher have been prepared for all sections of the course. 



g. Chicago's plan for the study of individual differ¬ 
ences from the kindergarten through the high 
school 

*B. Uses and limitations of standardized tests 
1. How accurate are the test results? 

2. How useful are the test results for prediction? 

3. Do the tests measure all one’s abilities? 

4. How can the information from them be used most 
effectively? 

5. Can the tests designate the one particular job for which 
each person is exactly fitted? 

6. Study of tests to be given in this course 

a. Description 

b. Why selected 

7. Chicago plan for the study of individual differences 
and for the development of techniques of self-appraisal 
from fourth grade through high school 

C. Psychological factors that must be considered in the interpretation 
of test results 
1. Maturation and change 

*a. The process of growing up 

(1) Physical and mental growth 

(2) Laws of natural growth—infancy to maturity 

(3) Influence of the environment on growth 

(a) Effect of frustrations 

(b) Effect of social environment 

(4) Adolescence 

(5) Maturity—the learning ability of adults 
*b. Individual control of the direction of growth 

(1) Habits: our masters or our servants 

(a) Conditioned response 

(b) Deliberate reconditioning 

(2) Development of work habits 

(a) Urge to mastery and completion 

(b) Urge to self-direction 
2. Mastering our environment 

*a. Human drives and obstacles 

(1) The basic drives 

(2) Psychological bases for the emotions 

(3) Motives derived from basic drives 

(4) The complexity of motives 

(5) Motives as products of the environment, plus 
psychological factors 

(6) The universality of obstacles 

(7) Drives in career planning 

*b. Mastery and adjustment—-interaction of the indi¬ 
vidual and the environment 



(1) Successful adjustment: mastery of one's prob¬ 
lems 

(a) Adjustment of environment to self 

(b) Adjustment of self to environment 

(2) Less successful types of adjustment 

(3) Self-appraisal and the choice of adjustment 

(4) Relation to educational and vocational planning 

D. Individual interpretation of the test results 
1. Necessary statistical concepts 
*a. Percentile rank 

b. Mean 

c. Median 

d. Quartile 

2. Profiles of test results to be made by the student 

a. Construction 

b. Interpretation of test data as samplings 

c. Comparison of abilities and achievements 

3. Aids in interpretation of individual performance on 
each of the following tests: 

*a. Thurstone Primary Mental Abilities 
*b. American Council on Education Psychology Ex¬ 
amination 

*c. Iowa Silent Reading Test, Advanced 
*d. National Institute of Industrial Psychology Clerical 
Examination 

*e. Cleeton Vocational Interest Inventory 

4. Aids in interpretation of the completed profile 

Unit III. Careers and Occupations (Simultaneously with Unit II) 
A. Man’s interdependence in work 
1. The growth of interdependence 

a. Primitive methods of work 

b. Development of specialization 

c. Effect of specialization 

d. Discussion of our present highly specialized work¬ 
ing world 

e. Release of human energy for cultural service, and 
leisure-time activities 

2. Evolution and importance of occupational groupings 

a. Development and significance of the merchant guilds 

b. Development of craft guilds 

c. Later history of craft guilds 

d. Present day significance 

(1) Employee organizations 

(2) Employer organizations 

(3) Trade associations 

(4) Professional organizations 


3. Socio-economic factors in the study of an occupational 
area 

a. Questionnaire study 

b. Basic attitudes toward occupational rewards other 
than money 

4. Legislation affecting workers 

a. Social Security—old age insurance 

b. Social Security—unemployment compensation 

c. Wage and hours laws 

d. Child labor laws 

B. Significant relations and trends in occupations 

1. Classification of occupations 

a. Importance of study of occupational areas in a broad 
sense as well as of specific occupations 

b. Occupational areas vs. occupational fields 

2. Significance of trends in occupations 

a. Technological 

b. Commercial 

c. Personal and domestic 

d. Professional and semi-professional 

C. Study of an occupation 

1. Relationship between the school subjects and occupa¬ 
tions related to those subjects 

2. Graphs of life earnings 

3. Case study of an individual 

4. Intensive study of several selected occupations 
(check list or outline for occupational study) 

5. Intensive study of several selected avocations 

D. Techniques in securing and holding work 

1. Channels in finding work 

2. Written application for work 

3. Making an interview 

4. Adjusting to a job 

Unit IV. *Summary 

A. Development of techniques for self-guidance 

1. Summarizing self-appraisal data 

2. Summarizing data for study of occupations 

B. Schedules for counseling during the ensuing semester 

1. Functions of placement counselor 

2. Functions of other counselors available in school 

3. Appointments 

C. Application of data to the solution of individual problems 
and the formulation of two plans—one a tentative long- 
range plan, and one a specific plan for immediate action, 
both to include provisions for education, vocation, and 
avocation 




D. Evaluation of the course 

Student Work Sheets have been prepared to implement the 
outline of Units I, II, and IV. The topics are listed below. Over 
2,000 copies of each were distributed to students during the year 
1939-1940. 

1. Introduction and Bibliography, 17 pp. 

2. Physical and Mental Differences, 6 pp. 

3. Standardized Tests, 7 pp. 

4. Process of Growing-Up, 15 pp. 

5. Self-Directed Personality Change, 10 pp. 

6. Human Drives, 14 pp. 

7. Mastery and Adjustment, 10 pp. 

8. Meaning of Percentile Rank 

9. Primary Mental Abilities, 5 pp. 

10. Mental Power as measured by A.C.E. Test, 4 pp. 

11. Reading Ability as measured by the Iowa Silent Read¬ 
ing Test, Advanced, 5 pp. 

12. Clerical Ability as measured by the N.I.I.P. Test, 4 pp. 

13. Vocational Interest, as indicated by the Cleeton Voca¬ 
tional Interest Inventory, 7 pp. 

14. Summarizing Self-Appraisal Data, 2 pp. 
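
The statistical concepts that students meet in Unit II, section D, and in work sheet 8 (percentile rank, mean, median, quartile) can be illustrated with a short Python sketch. The scores are invented, and the percentile-rank convention used (per cent of scores below, plus half of those equal) is one common definition assumed here; the work sheets themselves may define it differently.

```python
from statistics import mean, median, quantiles


def percentile_rank(score, scores):
    """Per cent of `scores` falling below `score`, counting half of the
    scores equal to it (one common convention, assumed here)."""
    below = sum(s < score for s in scores)
    equal = sum(s == score for s in scores)
    return 100.0 * (below + 0.5 * equal) / len(scores)


# Invented raw scores for a class of ten students on one test.
class_scores = [42, 47, 51, 55, 55, 58, 60, 63, 67, 72]

rank_of_58 = percentile_rank(58, class_scores)
class_mean = mean(class_scores)
class_median = median(class_scores)
quartiles = quantiles(class_scores, n=4)  # three cut points: Q1, Q2, Q3

print(rank_of_58, class_mean, class_median, quartiles)
```

Plotting each test's percentile rank on a common scale is what allows the student to compare abilities and achievements on a single profile.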


Battery of Tests for Self-Appraisal Used in 1939-1940 


I. Mental Tests 

A. *Thurstone Tests for the Primary Mental Abilities 

1. Perception 

2. Memory 

3. Number 

4. Space 

5. Verbal 

6. Inductive reasoning 

7. Deductive reasoning 

B. *American Council on Education Psychology Examination 

II. Reading Ability 

A. *Iowa Silent Reading Test, Advanced 

III. Achievement Tests 

A. *Iowa High School Content Examination 

B. American Council on Education Cooperative General 
Achievement Test 

1. Mathematics 

2. Science 

3. Social Science 

IV. Aptitude Tests, as desired 
A. Clerical 


*National Institute of Industrial Psychology Clerical 
Test, American Revision 

* [...] on Chicago groups [...] prepared by the Bureau of Child Study [...] 




B. Mechanical 

1. Detroit Mechanical Aptitudes Examination 

2. Individual Manipulation Tests 

C. Musical 

1. Kwalwasser-Dykema Music Tests 

2. Seashore Measures of Musical Talent 

D. Artistic 

Meier-Seashore Test for Art Judgment 
V. Miscellaneous 

A. Cleeton Vocational Interest Inventory 

B. Business Education Council Personality Rating Sched¬ 
ule 

C. Kuder Preference Record 

Bookshelves for Students and Teachers 
The following bookshelves have been set up for students and 
teachers. One set has been furnished to each high school. 

References for Students 

Psychological Books 

Blatz, William E. The Five Sisters. New York: W. Morrow and 
Company, 1939. 

Psychological Pamphlets (30 sets to each school) 

* (Reprints from a series of radio lectures published under the title 
of Psychology Today by the University of Chicago Press, 1932.) 

Garrett, Henry E.        Psychology Today 
Goodenough, Florence     Child Development 
Gesell, Arnold           Growth of the Infant Mind 
Watson, John B.          How to Grow a Personality 
Allport, Floyd H.        Personality in Our Changing Society 
Cannon, Walter B.        Effects of Strong Emotion 
Warden, Carl J.          Animal Drives 
Robinson, Edward S.      Learning and Forgetting 
Thorndike, Edward L.     Effects of Rewards and Punishments 
O'Rourke, L. J.          Matching Men and Occupations 

Occupational Books 

†Brewer, John M. Occupations. Boston: Ginn and Company, 1937.

†Chapman, Paul W. Occupational Guidance. Atlanta: Turner E.
Smith and Company, 1937.

†Clark, Harold F. Life Earnings. New York: Harper and Brothers,
1937.

†Fleischman, D. E. An Outline of Careers for Women. Garden City:
Doubleday, Doran and Company, 1935.

Giles, I. K. Occupational Civics. New York: Macmillan Company,
1936.

*†Lyons, George J. and Martin, Harmon C. The Strategy of Job
Finding. New York: Prentice-Hall, Inc., 1939.

* Books added during 1939-1940.
† Books which have been unitized.


National Resources Committee. Technological Trends and National
Policy. Washington, D. C.: United States Supt. of Documents, 1937.

Occupational Outlines on America's Major Occupations. Chicago:
Science Research Associates, 1940.

†Rosengarten, W. Choosing Your Life Work. New York: McGraw-
Hill, Inc., 1936.

United States Dept. of Commerce. Census of Business: 1935. Washington,
D. C.: Bureau of the Census, January, 1937. Also Census
of Retail Trade, 1936.

†Williamson, E. G. Students and Occupations. New York: Henry
Holt Company, 1937.

†Ziegler, S. H. and Wildes, Helen J. Choosing an Occupation.
Philadelphia: John C. Winston Co., 1937, revised edition.

Occupational Pamphlets 

American Job Series. Chicago: Science Research Associates, 1700
Prairie Avenue. 19 occupational monographs.

Are There Opportunities for Women? 1936. 10 pamphlets. Changing
Patterns in Occupations. 1936. 26 pamphlets. New York: National
Federation of Business and Professional Women’s Clubs, 1819
Broadway.

Occupational Pamphlets. New York: National Occupational Conference.
A series of appraisals and abstracts of available literature. 57
pamphlets.

Occupational Research Reports. Chicago: National Youth Administration
of Illinois, Merchandise Mart. 29 pamphlets.

Occupational Briefs. Briefs compiled by the National Youth Administration
on the occupations included in the reports above.

Guidance Leaflets. Washington, D. C.: United States Printing Office,
1936. 19 pamphlets.

Success—Vocational information series. Directed by Chloris Shade,
Joliet Township High School. Chicago: Morgan-Dillon and Company.
55 pamphlets.

Bibliographical Helps 

Bennett, Wilma. Occupational and Vocational Guidance—A Source
List of Pamphlet Material. New York: H. W. Wilson Company, 1936,
revised edition.

*Massachusetts Youth Administration. Bibliography of Occupational
and Apprenticeship Information. Boston: 31 St. James Avenue, 1937.
101 pp. Comprehensive list of magazine articles.

Parker, Willard E. Books About Jobs. Published for the National
Occupational Conference by the American Library Association,
Chicago, 1936.

Price, Willodeen and Ticen, Zelma E. Index to Vocations. New York:
H. W. Wilson Company, 1936, revised edition.

Bibliography of References on Vocational Guidance for Girls and
Women. United States Office of Education. Washington: Vocational
Division, 1936, revised, 13 pp. Lists bibliographies, studies and
investigations.

* Books added during 1939-1940.
† Books which have been unitized.




Vocational Guide. Chicago: Science Research Associates. A monthly
bibliography of occupational books and articles.

Research Services 

Occupational Card File on Current Local Data. Chicago: Bureau of
Occupational Research, Board of Education.

Cumulative Bulletin Series. Chicago: Bureau of Occupational Research,
Board of Education.

Special Research Reports. Chicago: Placement Clearance Center, a
division of the Bureau of Occupational Research, Board of Education.

References for Teachers 

Psychological Books 

Bingham, Walter V. Aptitudes and Aptitude Testing. New York:
Harper and Brothers, 1937.

Paterson, Donald G., Schneidler, Gwendolen G., and Williamson, E. G.
Student Guidance Techniques. New York: McGraw-Hill, Inc., 1938.

Shaffer, Lawrence F. The Psychology of Adjustment. Boston:
Houghton, Mifflin Company, 1936.

Strang, Ruth M. Role of Teacher in Personnel Work. New York:
Teachers College, Columbia University, 1936.

Occupational Books 

Lincoln, Mildred E. A Short List of References on Methods of
Teaching Occupations. New York: National Occupational Conference.
Mimeographed, 3 pp. Free upon request.

Lincoln, Mildred E. and Brewer, John M. How to Teach Occupa¬ 
tions. Boston: Ginn and Company, 1937. 

The Reactions of Students 

It is too early to obtain an adequate evaluation of the course.
If the opinion of the students is a criterion (and who doubts that
it is an important component?), the course is highly successful.
The reactions of students indicate their deep sense of responsibility
at this level of high school training, the changes in their
viewpoints engendered by the course, and their gratitude both for
the new knowledge acquired and for the personal guidance from
the fine men and women who have taught the course. A few of
the student comments are presented below:

“I feel I have benefited by almost every topic and dis¬ 
cussion in this course in Careers, but several parts have been 
very helpful. I enjoyed all the tests, and the experience of 
having had them helped me when I applied for a position at 
the Continental Bank. Four tests were required and one was 
almost identical to those we have been taking. All the tests 
at the bank were somewhat like those we have been taking 
and I was much more confident than I would have been, if 
the work had been new to me. Another part of the course 
that I feel has been of great help to me has been hearing 


about various occupations. It is much easier to decide upon 
a vocation for yourself, one that you think you will like, after 
you hear the good and bad points about many vocations. I
have a much clearer idea of what I would like to be in the 
future than I had before I took Careers.” 

Gloria K., Hirsch High School

“The survey of different vocations and professions has 
been most helpful to me. It never occurred to me that one 
vocation could branch into so many specific fields. The 
Careers class threw light on many subjects concerning which 
I was in the dark, such as the present and future demands in 
the labor market, the demand of the employer upon his em¬ 
ployee, and the amount of education needed to get along in 
a vocation.”

Frances, Harper High School

“This course of Self-Appraisal which you are offering is 
very good in building character, citizens, and regular ladies 
and gentlemen. Now I don't say this just to be your good 
friend because it is everything that I mentioned above and 
more. 

“One of the things that struck me most was the way you 
treat the pupils. Because there is nothing like having a regu¬ 
lar guy talking to another regular guy. 

“After three and a half years of bumming, cutting, etc., 
this course brought me to my senses. I don't know what it 
was — whether it was the tests, or the homely philosophy — 
but the course was interesting. 

“And now in ending I want to thank you for making this 
change in me. And later in life I’ll come up and give you a 
visit. Maybe I’ll be a bum on Madison Street or a big shot
on Michigan Boulevard. I’ll always come and visit the regular
guy and at the same time ask him for advice.”

Ted T., Harrison High School

“Very few of those who finish high school ever sit down 
and take an inventory of themselves. The talks students 
have with the counseling teacher make you gather your wits 
about you and make you think of how to approach your em¬ 
ployer. Those who are backward and bashful come out of 
their shell, due mostly to the reassurance of the teacher who 
gives them a boost upward.”

Margaret I., Marshall High School

“The most important part of the Careers course is the
making of the Career book on the selected occupations, because
it helps you find out all about the occupation and to
make sure you will fit in that line of work. After making my
book on careers in pattern making, I found I needed to brush
up on a few technical things. Some students found out that 



they never will or could fit in the occupation they had first 
chosen, and they have had time to make a better choice.” 

Chester L., Steinmetz High School

“Of all the courses I have had, one of the most important
subjects for me and for the development of my character has
been ‘Self-Appraisal and Careers.’ It is a subject which we
have to give a lot of thought to, with quite a bit of brain 
work. It has been helpful in many ways: (1) it makes one 
think fully on the future when he or she leaves education and 
goes into the open world; (2) it helps one to know people 
and to understand them much better; (3) it helps one dis¬ 
cover the vocation into which he will best fit; (4) it helps one 
get a clearer view of the world and of occupations.”

Kathleen M., Waller High School 

Handbooks and Bulletins 

More complete information concerning the course in Self- 
Appraisal and Careers will be found in mimeographed bulletins 
available from the Board of Education in the City of Chicago. The 
bulletins may be obtained for the cost of mailing (75c) by writing
to the Bureau of Child Study, 228 N. La Salle St. 

Prepared jointly by the Bureau of Child Study and the Bureau 
of Occupational Research: 

Handbook on Self-Appraisal and Careers, 17 pp. 

Teachers’ Outlines for Self-Appraisal and Careers, 86 pp.

Prepared by the Bureau of Child Study:

Student Work Sheets for Self-Appraisal and Careers, 106 pp.

Handbook of Norms, 30 pp.

Handbook on Scoring Procedures, 36 pp.

High-School Teacher’s Devices and Suggestions (Subject: Self-
Appraisal and Careers, A Bulletin issued by the Superintendent of
Schools), 19 pp.

Bureau of Child Study Annual Report, Part V, High-School Self-
Appraisal and Careers Course, June, 1939, 9 pp.

Service Bulletin No. 1, 1940, Methods of Presenting the Course in
Self-Appraisal and Careers.

Prepared by the Bureau of Occupational Research: 

Cumulative Bulletin Series: 

Series I Educational Facilities, 30 pp. 

Series II Occupational Information, 23 pp. 

Series III Significant Trends, 3 pp. 

Series IV Pertinent Legislation, 11 pp. 


REFERENCES 

Johnson, William H. Annual Report of the Superintendent of Schools.
Chicago: 1937-38. P. 339.

-----. Annual Report of the Superintendent of Schools. Chicago:
1938-39. Pp. 176-79.

-----. Annual Report of the Superintendent of Schools. Chicago:
1939-40.

-----. “Adjustment Service in the Chicago High Schools,” Educational
Administration and Supervision, XXIII (Oct., 1937), 513-20.

-----. “Adjustment Service in High Schools,” American School
Board Journal, XCVI (May, 1938), 30-32.

-----. “Adjustment Teacher Service in Chicago Elementary Schools,”
The Elementary School Journal, XXXVIII (Dec., 1937), 264-71.

-----. “Guidance, Counseling and Adjustment,” The School Executive,
LVI (May, 1937), 333-341.

-----. “New Self-Appraisal and Career Study Course in the Chicago
High Schools,” School and Society, XLIX (May 20, 1939), 627-31.

-----. “Place of Guidance, Counseling and Adjustment in the Secondary
Schools,” The North Central Association Quarterly, XII (Jan.,
1938), 369-72.

Munson, Grace. “Adjustment Service of Chicago High Schools,”
Occupations, XVII (Feb., 1939), 389-94.

-----. “Adjustment Service—Chicago Schools,” Educational Method,
XIX (March, 1940), 327-35.

Schloerb, Lester J. “N.Y.A. Occupational Monographs,” Chicago
School Journal, XXI (Jan.-Feb., 1940), 180-81.

-----. “Placement Counselors in Chicago Schools,” Occupations,
XVIII (Feb., 1940), 387.


PRIMARY MENTAL ABILITIES AND AVIATION
MAINTENANCE COURSES*

WILLARD HARRELL, University of Illinois 
and 

RICHARD FAUBION, Air Corps Technical Schools 

This investigation is the third of a series designed to determine
the optimal pattern of abilities for mechanical work. The
first study, “A Factor Analysis of Mechanical Ability Tests” (1),
suggested that the principal component of the Minnesota series of
mechanical tests is the Space factor. A second factor, tentatively
identified as the Perceptual, was present in that battery. A Manual
Agility factor was also isolated. None of the Minnesota tests
possessed a significant weight for this Agility factor. The most
practical conclusion from this first study was that certain paper
and pencil tests will measure equally validly each of the factors
present in more cumbersomely administered mechanical tests.

The second study, “Selection Tests for Aviation Mechanics” (2),
consequently involved only paper and pencil tests. This
second study was started after the publication of Thurstone's
monograph, “Primary Mental Abilities” (3), but before
his Experimental Battery of Primary Mental Ability Tests (4)
became available. Nine of the tests from the monograph supplement
were included along with 29 other sub-tests. These were
taken by 84 basic instruction students of the Air Corps Technical
Schools. Basic instruction grades from each of five aviation
maintenance courses with a total duration of eight weeks formed
external criteria. These course grades were the criteria for both
the second study and the third, the subject of this paper.

Air Corps Technical School students take these five basic
instruction courses regardless of later specialization in radio,
photography, airplane mechanics, parachute rigging, or other
advanced specialties. The five basic courses are Shop Mathematics,
Mechanical Drafting and Blueprint Reading, Air Corps Fundamentals,
Elements of Metalwork, and Elements of Electricity.
The names are perhaps sufficiently definitive except for two of
the courses. Air Corps Fundamentals is unlike the others in that
it does not entail mechanical problems. It is made up of the
study of Air Corps rules and nomenclature. Shop Mathematics
includes the following topics: addition, subtraction, multiplication,
and division; fractions and decimals; denominate numbers
and mensuration; formulas and tables; shop trigonometry; applied
problems.

* This report is of a study sponsored jointly by the Trade Test Department,
Air Corps Technical Schools, and the University of Illinois’ Graduate
Research Committee. The paper was read at the Mid-Western
Psychological Association, May 4, 1940, at the University of Chicago.

Entrance to the Air Corps Technical Schools is restricted to 
soldiers in the United States Army. Consequently the minimum 
age is 18 years. The education requirement is graduation from 
high school or the equivalent. A minimum Army Alpha percentile 
rank of 75 is required. Percentiles here are based on the Army 
population. The percentile rank of 75 corresponds to an Otis 
I.Q. of 100. 

One hundred and five soldiers, students of the Air Corps
Technical Schools, were given Thurstone’s Experimental Battery
of Primary Mental Ability Tests (4) and two additional tests
found predictive in the previous study (2), Surface Development
and Punched Holes. Army Alpha scores were also available
since they are used as an entrance requirement. About half the
group was in the advanced phase of Airplane Mechanics, and the
other half was in Radio Mechanics. The age range was 18-39 with a
mode of 19. The range for years of formal schooling was 9-15.
Sixty-four had completed high school but had gone no further.

The classification of students into the various advanced phases, 
as well as their selection, might be considered a test problem in 
part, but only the selection angle will be considered here. Re¬ 
sults are becoming available from tests given to 600 students to 
provide sufficiently large samples to trace the correlation between 
tests and several advanced phases. 

It is recognized that the course grades are not perfect criteria.
They are complex, but since they consistently correlate significantly
with several tests, they probably possess a reasonable
amount of validity. One objective criterion, a machine shop product,
has been developed which it is hoped will have a satisfactory
reliability. Other practical criteria and objective information
criteria are planned.

The reliability has been estimated by the split-half method for 
each of the sub-tests correlating as high as .30 with a criterion. 
These coefficients are shown in Table III. 
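The split-half estimate described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' worksheet: the halves are taken as odd- and even-numbered items (the paper does not say how the halves were formed), the half-test correlation is "stepped up" with the Spearman-Brown formula, and the function names and the tiny data set are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(item_scores):
    """item_scores: one list of item scores per examinee (assumed odd/even split)."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson(odd, even))

# Hypothetical right/wrong scores for three examinees on a four-item test.
demo = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
reliability = split_half_reliability(demo)
```

On this reading, a stepped-up reliability of .60 would correspond to a half-test correlation of roughly .43, which shows how much the correction adds for short sub-tests.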

Only two of the seven Alpha sub-tests with significant correlations
with any grade, namely Addition and Analogies, have a
reliability above .90. Alpha Arithmetic, with a reliability of .60,
is lowest.

Three of the Primary Mental Ability tests, Completion, Arithmetic,
and Number Series, have reliabilities of less than .90 but
more than .80. An item analysis, showing the correlation of
each item in the PMA battery with total sub-test score, suggests
that the relatively low reliability in these three cases is probably
due in part to the items not being arranged in order of their difficulty.

Four of the PMA tests, Addition, Same-Opposites, Cards, and 
Figures, have reliabilities above .97. These high reliabilities may 
be partially explained by the items within each of the tests being 
practically of equal difficulty. 

Comparison of A. C. T. S. Students with High School Seniors 

A comparison has been made between the PMA scores for Air 
Corps Technical School students and the norms published for 300 
Hyde Park High School (Chicago) seniors. Table IV shows this 
comparison. Critical ratios have been calculated from the differ¬ 
ences between means. CR’s for Number and Memory are less than 
.30. Hyde Park seniors have higher Perceptual, Verbal, and 
Induction scores. Air Corps Technical School students have 
higher Reasoning and Space scores. 
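The critical ratios in Table IV appear consistent with the usual large-sample formula: the difference between two means divided by the standard error of that difference. A minimal Python sketch follows, assuming independent groups and that formula (the paper does not state which formula was used). The function name is illustrative, and the figures are the Number and Verbal rows as read from Table IV.

```python
from math import sqrt

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Absolute difference between two independent means over its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / se_diff

# Hyde Park seniors (N = 300) against A.C.T.S. students (N = 105).
number_cr = critical_ratio(119.5, 30.00, 300, 118.78, 27.80, 105)
verbal_cr = critical_ratio(84.5, 27.50, 300, 75.16, 19.95, 105)
print(round(number_cr, 2), round(verbal_cr, 2))  # 0.22 3.72
```

Both values agree with the tabled critical ratios, which supports reading the table this way; the direction of each difference still has to be taken from the means themselves.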

It is difficult to interpret these results because it is not possible
to say exactly what selective agents are at work in the Air
Corps Technical School. The most obvious ones are being a
soldier, choice by a commandant (which presumably means interest
in mechanical work), completion of high school, and having an
Army Alpha percentile rank of 75.

A difficulty with the Reasoning or D score is that one of the
tests, Mechanical Movements, on which the D score depends, also
possesses a significant weight in another factor which, from an
unpublished factor analysis by the writers, seems to be Knowledge
of Mechanical Processes. Since the present group is
selected in part on their interest in and, presumably, knowledge of
mechanical processes, this would increase the Mechanical Movements
score and consequently the D score, without demonstrating
that they are better reasoners than the Hyde Park seniors.

Results with Primary Mental Abilities Tests 

All of the PMA scores were obtained by adding test scores.
Five of the seven, all but Perception and Memory, correlate significantly
with at least one of four basic instruction grades.
Table I lists the product-moment correlation coefficients. A
significant correlation is considered to be one where the coefficient
is at least four times its probable error. For 105 cases this
is a correlation of .24.
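The significance rule above can be sketched in Python. The sketch assumes the conventional probable-error formula for a correlation, PE_r = .6745(1 - r^2)/sqrt(N); the paper does not print the formula, but this one reproduces the probable errors noted under Table I (.05 at r = .50 and .06 at r = .20 for 105 cases). The function names are illustrative.

```python
from math import sqrt

def probable_error_r(r, n):
    """Conventional probable error of a product-moment correlation."""
    return 0.6745 * (1 - r * r) / sqrt(n)

def is_significant(r, n, multiple=4):
    """True when |r| is at least `multiple` times its probable error."""
    return abs(r) >= multiple * probable_error_r(r, n)

print(round(probable_error_r(0.50, 105), 2))  # 0.05
print(round(probable_error_r(0.20, 105), 2))  # 0.06
print(is_significant(0.37, 105), is_significant(0.17, 105))  # True False
```

Because the probable error itself shrinks as r grows, the cutoff has to be found where r meets its own multiple of PE; under this formula that point lies near .24 to .25 for 105 cases.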

Elements of Metalwork does not correlate significantly with
any of the tests, but it did in the second study referred to above.
The test correlations with Shop Mathematics and with Mechanical
Drafting appear quite similar to those in the previous study.
In both groups, Addition, Number Series, and Surface Development
correlated significantly with Shop Mathematics; and in
each study Mechanical Movements, Surface Development, and
Punched Holes with Mechanical Drafting. There are stronger
correlations with Electricity and with Air Corps Fundamentals
in this study than in the second. A possible explanation is that
the present battery has more Verbal tests, and these correlate
significantly with each of those two courses. More important is
that the greater dispersion for age and schooling of the present
group tends to increase the correlations.

Looking again at Table I, the Number factor correlates sig¬ 
nificantly only with Shop Mathematics; Space correlates signifi¬ 
cantly with Shop Mathematics, and with Mechanical Drafting; 
Induction with Shop Mathematics, Electricity, and Mechanical 
Drafting; while Reasoning and the Verbal score correlate significantly
with each of the four basic grades.

Multiple correlation coefficients using only significant zero-order
coefficients have been computed between PMA scores and
each of four basic grades. These may be compared with correlations
of Alpha total with the four criteria grades. The multiple
R's from the factor scores are .46 with Shop Mathematics, .57
with Electricity, .60 with Mechanical Drafting, and .36 with Air
Corps Fundamentals. Corresponding values for Alpha total are
.31, .47, .30, and .41. These multiple R’s, as well as others to be
mentioned later, would be expected to be less in other samples by
the shrinkage effect if the same regression formulas were used.
The multiple correlation between four factor scores, Verbal,
Space, Induction, and Reasoning, is .63 with a composite basic
grade obtained from adding grades in Shop Mathematics, Electricity,
and Mechanical Drafting. Alpha total correlates .45 with
this same composite. The zero-order correlations with this composite
grade are given in Table VI. Table V shows the intercorrelations
of five factor scores.
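A multiple R of this kind can be recomputed from the zero-order correlations alone. The Python sketch below is an illustration, not the authors' computing routine: it solves the normal equations for the standardized regression weights and takes R as the square root of the weighted sum of criterion correlations. The Electricity example uses the three significant predictors (Verbal, Induction, Reasoning) with values as read from Tables I and V; function names are hypothetical.

```python
from math import sqrt

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def multiple_r(predictor_corr, criterion_corr):
    """Multiple correlation from predictor intercorrelations and validities."""
    betas = solve(predictor_corr, criterion_corr)
    return sqrt(sum(b * r for b, r in zip(betas, criterion_corr)))

# Order: Verbal, Induction, Reasoning (intercorrelations from Table V).
predictors = [[1.00, 0.28, 0.33],
              [0.28, 1.00, 0.54],
              [0.33, 0.54, 1.00]]
electricity = [0.51, 0.29, 0.40]  # correlations with Electricity, Table I
r_electricity = multiple_r(predictors, electricity)
print(round(r_electricity, 2))  # 0.57
```

The result matches the .57 reported for Electricity. The shrinkage mentioned in the text arises because these weights are fitted to the same 105 cases on which R is evaluated.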


Results are shown in Table II for those sub-tests which correlate
.30 or more with one of the basic grades. This is five times
the probable error.


Conclusions 

The Air Corps Technical Schools are planning to supplement
their test selection in line with these results; and they also expect
to establish test standards for classification from future studies.

We have come to the conclusion from this and other studies
that there is no one separate factor for a mechanical ability.
Rather, there are several factors which are more or less prominent
in mechanical work, their pattern depending on its type
and complexity and on the point reached in the learning curve.

A Perceptual factor, although present in several so-called
Mechanical Aptitude tests, is probably related to mechanical
work, borrowing an expression from Holzinger, as an Arti-factor.
The Verbal factor has been shown to be evident in training for
mechanical work of relatively great complexity. Among the more
important factors in mechanical operations are Space, one or
two Reasoning factors, and Knowledge of Mechanical Processes.
A Manual Agility factor is present in routine jobs where individual
differences depend on the manipulation of objects such
as nuts and bolts.


TABLE I

Product-Moment Correlation Coefficients Between 5 “Primary Mental
Ability” Scores and 4 Aviation Maintenance Courses for 105 Soldiers*

                                     Shop    Elec-     Blue Print     Air Corps
                                     Math    tricity   Reading and    Funda-
                                                       Mech. Draftg.  mentals

14. N (Addition, Multiplication)      37      17          00             ?
17. V (Completion, Same-Opposites)    28      51          37            33
20. S (Cards, Figures)                25      17          36            02
27. I (Letter Grouping, Marks,
       Number Patterns)               33      29          41            20
31. D (Arithmetic, Number Series,
       Mechanical Movements)          26      40          54            24

PE r = .05 where r = .50
PE r = .06 where r = .20

* Decimal points have been omitted before each coefficient.




TABLE II

Product-Moment Correlation Coefficients Between 19 Tests and 4
Aviation Maintenance Courses for 105 Soldiers*

                                     Shop    Elec-     Blue Print     Air Corps
                                     Math    tricity   Reading and    Funda-
                                                       Mech. Draftg.  mentals

 1. Alpha Addition                    33      06         -10           -04
 2. Alpha Arithmetic                  21      35          30            27
 3. Alpha Common Sense                13      31          06            06
 4. Alpha Word Opposites              13      43          24            38
 5. Alpha Mixed Sentences             17      43          23            34
 6. Alpha Number Series               37      15          24            07
 7. Alpha Analogies                   26      35          23            39
12. Thurstone Addition                39      23          08            24
15. Thurstone Completion              23      47          39            21
16. Thurstone Same-Opposites          27      45          30            34
18. Thurstone Cards                   23      1?          32            00
19. Thurstone Figures                 21      12          33            05
24. Thurstone Letter Grouping         23      25          31            22
26. Thurstone Number Patterns         30      18          32            05
28. Thurstone Arithmetic              31      49          49            29
29. Thurstone Number Series           22      33          39            17
30. Thurstone Mechanical Movements    08       ?          40            10
32. Thurstone Punched Holes           15      16          41            06
33. Thurstone Surface Development     35      21          50            02

* Decimal points have been omitted before each coefficient.


TABLE III

Test Reliabilities by the Split-Half Method (Stepped-up)

N = 103

 1. Alpha Addition             .98     19. Thurstone Figures               .99
 2. Alpha Arithmetic           .60     24. Thurstone Letter Grouping       .91
 3. Alpha Common Sense         .87     26. Thurstone Number Patterns       .92
 4. Alpha Word Opposites       .88     28. Thurstone Arithmetic            .87
 5. Alpha Mixed Sentences      .80     29. Thurstone Number Series         .87
 6. Alpha Number Series        .84     30. Thurstone Mechanical Movements  .93
 7. Alpha Analogies            .93     32. Thurstone Punched Holes         .89
12. Thurstone Addition         .98     33. Thurstone Surface Development   .95
15. Thurstone Completion       .80
16. Thurstone Same-Opposites   .99
18. Thurstone Cards            .99




TABLE IV

Comparison Between 300 Hyde Park High School Seniors and 105 Air
Corps Technical School Students in “Primary Mental Abilities”

                  Hyde Park            A.C.T.S.
                   Seniors             Students
                Mean     S.D.       Mean     S.D.       CR

Perception      152      23.75     137.09    16.51     7.04
Number          119.5    30.00     118.78    27.80     0.22
Verbal           84.5    27.50      75.16    19.95     3.72
Space           109.5    35.00     125.23    34.14     4.04+
Memory           15.5     7.25      15.65     7.67     0.18+
Induction        35.5     9.00      28.46     8.76     7.04
Reasoning        54.5    18.?5      68.85    18.48     6.69+


TABLE V

Product-Moment Correlation Coefficients
Among Factor Scores for 105 Soldiers

        N     V     S     I
V      31
S      30    15
I      33    28    39
D      20    33    41    54


TABLE VI

Product-Moment Correlation Coefficients Between Tests and a Composite
Basic Grade Composed of Shop Math, Electricity, and
Mechanical Drafting

N = 105

 1. Alpha Addition              ?      19. Thurstone Figures               .29
 2. Alpha Arithmetic           .34     20. Space                           .34
 3. Alpha Common Sense         .17     24. Thurstone Letter Grouping       .34
 4. Alpha Opposites            .31     26. Thurstone Number Patterns       .36
 5. Alpha Mixed Sentences      .32     27. Induction                       .44
 6. Alpha Number Series        .35     28. Thurstone Arithmetic            .53
 7. Alpha Analogies            .35     29. Thurstone Number Series         .39
    Alpha Total                .45     30. Thurstone Mechanical Movements  .26
12. Thurstone Addition         .29     31. Deduction                       .50
14. Number                     .23     32. Thurstone Punched Holes         .31
15. Thurstone Completion       .44     33. Thurstone Surface Development   .47
16. Thurstone Same-Opposites   .41
17. Verbal                     .47
18. Thurstone Cards            .32



REFERENCES 


1. Harrell, T. W. “A factor analysis of mechanical ability tests.”
Psychometrika, V, 17-33.

2. Harrell, T. W. and Faubion, R. W. “Selection tests for aviation
mechanics.” Journal of Consulting Psychology, in press.

3. Thurstone, L. L. “Primary mental abilities.” Psychometric
Monographs, No. 1. Chicago: University of Chicago Press, 1938.

4. Thurstone, L. L. Manual of instructions: Tests for primary mental
abilities. Washington: American Council on Education, 1938.


A COMPARISON OF THE ORIGINAL AND REVISED
STANFORD-BINET INTELLIGENCE SCALES

MARTIN L. REYMERT AND RALPH K. MEISTER
The Mooseheart Laboratory for Child Research

The present study is an attempt to compare the original and
the revised Stanford-Binet Intelligence Scales. The data have
been obtained from 440 Mooseheart children, each of whom has
had from two to nine examinations. The population of tests
comprises 958 administrations of the original scale and 823
administrations of the revised. The testing was done by trained
clinical psychologists of the Laboratory staff. The children are
all normal and have been drawn from every state in the Union,
predominantly from the Middle West.

The following items were recorded: the child’s name, birthdate,
the date of administration of the test, the I.Q. rating
obtained, the M.A., the C.A., the basal year score, the highest
level of success, and the amount of scatter. The time interval between
administrations and the direction and amount of deviation
from the first I.Q. rating to the second were obtained for each
pair of successive administrations.

To compare the equivalence of ratings from one scale to the
other with the respective reliabilities of the scales, using the
same population, two groups of children were chosen. All had
had at least two examinations with the original scale and two
with the revised. However, Group A of Table I had taken the
L form of the revised scales first while Group B had taken the
M form first.


TABLE I

Correlations Between the Various Forms of the Stanford-Binet Scales
for Constant Populations

                                        Group A                Group B
Scales Correlated                  O1O2    O2L     LM      O1O2    O2M     ML

N                                   84     84      84       41     41      41
r                                  .83    .86     .90      .90    .69     .89
Av. Age at First Test (in years)   8.9   10.4    11.8      9.3   10.3    11.4
Av. Int. between Tests (in years)  1.6    1.3     1.1      1.0    1.1     1.2



The reliability coefficients for the original scale in both
groups are above .80. So are the correlations between the two
forms of the revised scales, which have been considered here as
analogous to reliability coefficients. The correlations between
the original scale and each form of the revised, which give an
estimate of the equivalence of ratings, are not significantly
different,¹ although the correlation with the M form is lower. Thus
it can be said that the reliabilities for both scales and the correlations
between both scales are essentially high and equal.

Table II, which shows the correlations between the original
scale and the two forms of the revised, and their respective
reliabilities, using all the population that was available with no
attempt to keep the composition of the groups the same, gives
estimates that are all high with but one exception.


TABLE II

Correlations Between the Various Forms of the Stanford-Binet Scales
for Populations of Variable Composition

Scales Correlated                  O1O2   L1L2*  M1M2*    OL     OM     LM     ML

N                                  118     85     44     116     61      ?     89
r                                   ?     .85    .89     .82    .76    .88    .60
Av. Age at First Test (in years)  10.3    9.7   10.0    11.8   12.1   10.9   10.3
Av. Int. between Tests (in years)  1.3    1.9    1.9     ?.8    2.7    1.2    1.0

* An administration of an alternate form was included between the
two forms correlated. Therefore these correlations are not between
successive administrations as are the others.


The estimate of correlation between the M form and the L
form of the revised scale is significantly lower than any of the
other estimates. However, since an estimate of this same correlation
obtained in Group B (Table I) is high, .89, the lower coefficient
here may be due to the particular sample taken.

¹ No P.E. is given in this study, since an improved technique (Rider,
10, pp. 64-85) has been used to determine whether the difference between
correlation coefficients is significant. The use of the P.E. is a crudely
approximative method at best, and in this particular case it is erroneous,
since the assumption of a normal distribution of correlation coefficients
is probably violated.
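The article cites Rider for the "improved technique" but does not name it. One standard way to compare two correlation coefficients without assuming that r itself is normally distributed is Fisher's z transformation; that this is the procedure actually used is an assumption. The sketch below applies it to two Table II entries, L₁L₂ (r = .85, N = 85) and ML (r = .60, N = 89), and treats the two samples as independent, which the overlapping groups of the study only approximate. The function name is hypothetical.

```python
import math


def fisher_z_ratio(r1, n1, r2, n2):
    """Difference between two Fisher-transformed correlations over its standard error."""
    z1 = math.atanh(r1)  # Fisher's z = 0.5 * ln((1 + r) / (1 - r))
    z2 = math.atanh(r2)
    # Standard error of z1 - z2, assuming two independent samples.
    se_diff = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se_diff


# Table II values: L1L2 (r = .85, N = 85) against ML (r = .60, N = 89).
ratio = fisher_z_ratio(0.85, 85, 0.60, 89)
print(f"critical ratio = {ratio:.2f}")
```

Under these assumptions the ratio comes out a little above 3.6, which is in line with the statement that the ML correlation is significantly lower than this estimate.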

Another view of the equivalence of ratings from scale to
scale can be obtained from a study of the deviations from
administration to administration. The results presented in Table III,
where the grouping is according to I.Q. classification, show that
within each scale and between scales the individuals with the
lowest I.Q.'s gain most upon retest, those of average I.Q. gain
some, while those of highest I.Q. actually lose.


TABLE III

Deviations Between Ratings in Successive Administrations According
to I.Q. Classifications

Scales         I.Q. Level       N      N (pos.)      N (neg.)      M (pos.)      M (neg.)      |M|           M
Administered
O₁O₂           Below 90         197    109           75            6.9           4.1           5.3           +2.2
O₁O₂           90 to 110        328    155           150           6.5           5.1           5.7           [illegible]
O₁O₂           110 and above    66     [illegible]   42            7.8           9.1           8.4           -3.2
LM or ML       Below 90         63     [illegible]   20            7.6           4.4           6.5           [illegible]
LM or ML       90 to 110        197    123           [illegible]   6.6           3.8           5.3           +3.0
LM or ML       110 and above    124    60            60            5.0           6.1           5.4           -.5
OL or OM       Below 90         94     60            23            [illegible]   4.2           7.7           +5.6
OL or OM       90 to 110        193    133           55            9.9           5.2           6.3           +4.6
OL or OM       110 and above    50     19            31            1.6           3.0           [illegible]   -1.3


We have here the expected tendency for the extremes of the
distribution to migrate toward the mean with successive retestings
(regression).

In Table IV, where the deviations are classified according to
the length of interval between tests, the mean of the absolute
deviations increases as the interval becomes longer in every case
but one.

In that case, the reversal of tendency may be discounted in
view of the small number of cases (10). It is concluded that the
longer the interval between successive administrations, the
greater the discrepancies in the ratings.

Table V, which gives for both scales the relation between
the size of deviation and the number of tests taken, shows that
the mean of the absolute deviations decreases slightly with
successive tests for the revised scale and does the same for the
original scale with one exception.



TABLE IV

Deviations in Ratings According to the Length of Interval Between
Administrations

Scales         Interval between      N      N (pos.)      N (neg.)      M (pos.)      M (neg.)      |M|           M
Administered   Testing (in years)
O₁O₂           Less than 1 year      95     53            35            7.2           4.8           5.6           [illegible]
O₁O₂           1 year to 2 years     460    213           [illegible]   6.7           5.6           5.9           [illegible]
O₁O₂           2 years and above     34     7             25            4.6           [illegible]   6.5           -4.6
LM or ML       Less than 1 year      93     56            33            [illegible]   4.5           5.8           [illegible]
LM or ML       1 year to 2 years     297    175           106           6.7           4.9           5.7           [illegible]
LM or ML       2 years and above     10     3             [illegible]   5.3           7.0           [illegible]   [illegible]
OL or OM       Less than 1 year      24     15            9             [illegible]   4.9           7.7           +3.6
OL or OM       1 year to 2 years     164    109           53            8.3           5.1           7.2           +3.8
OL or OM       2 years and above     151    [illegible]   35            10.6          [illegible]   9.0           +6.9


TABLE V

Deviations in Ratings Between Successive Administrations in Relation
to the Number of Tests Taken

Scale      Administrations     N             N (pos.)      N (neg.)      M (pos.)      M (neg.)      |M|    M
Original   O₁O₂                244           109           124           8.3           [illegible]   6.8    +.6
Original   O₂O₃                166           87            65            6.1           5.8           5.5    +.9
Original   O₃O₄                [illegible]   52            48            5.4           4.7           5.1    +.5
Revised    [label illegible]   64            28            [illegible]   5.7           5.2           5.3    -.1
Revised    [label illegible]   133           [illegible]   36            5.4           5.4           5.8    [illegible]
Revised    [label illegible]   60            32            33            [illegible]   3.9           4.1    [illegible]


In general, with continued retesting, the discrepancies between
successive administrations tend to become slightly smaller.
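The rows of the deviation tables appear to be tied together arithmetically: if M (pos.) and M (neg.) are the mean sizes of the gains and of the losses, the mean absolute deviation and the mean net deviation follow from them and the counts. This reading of the rows is an assumption, not something the article states. A minimal sketch, checked against the O₂O₃ entries of Table V (N = 166, 87 gains averaging 6.1, 65 losses averaging 5.8); the function name is hypothetical.

```python
def deviation_summary(n, n_pos, mean_pos, n_neg, mean_neg):
    """Return (mean absolute deviation, mean net deviation) for n retest pairs.

    Pairs with no change count toward n but add nothing to either sum.
    """
    total_gain = n_pos * mean_pos  # summed size of positive deviations
    total_loss = n_neg * mean_neg  # summed size of negative deviations
    mean_abs = (total_gain + total_loss) / n
    mean_net = (total_gain - total_loss) / n
    return mean_abs, mean_net


# O2O3 column of Table V.
mean_abs, mean_net = deviation_summary(166, 87, 6.1, 65, 5.8)
print(round(mean_abs, 1), round(mean_net, 1))
```

The two results round to the |M| and M values tabled for that column (5.5 and +.9), which supports this reading of the rows.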

Table VI, which presents the deviations according to the
chronological age of the individual at the time of the first test,
shows a different trend of deviations with age for each scale.



TABLE VI

Deviations in Ratings According to the Age of the Individual
at the First Test

Scales         Age Level    N             N (pos.)      N (neg.)      M (pos.)      M (neg.)      |M|           M
Administered
O₁O₂           Below 8      205           97            102           6.2           6.0           5.9           0
O₁O₂           8 to 10      [illegible]   92            108           [illegible]   5.5           [illegible]   1.1
O₁O₂           Above 10     157           78            [illegible]   7.3           5.0           5.8           1.5
LM or ML       Below 8      60            [illegible]   21            8.0           5.7           [illegible]   3.1
LM or ML       8 to 10      103           [illegible]   32            6.6           4.2           5.5           [illegible]
LM or ML       Above 10     226           [illegible]   87            6.1           [illegible]   4.3           [illegible]
OL or OM       Below 8      36            17            [illegible]   6.2           6.3           6.1           [illegible]
OL or OM       8 to 10      100           61            37            9.1           4.7           7.1           3.0
OL or OM       Above 10     200           165           40            10.1          4.8           8.6           6.0


With the original scale, the mean of the absolute deviations
is a maximum in the middle age group; with the revised scale, it
decreases with age; and, between the original and revised scales,
it increases with age. In considering the net gain upon retesting,
it is found that for the original scale there is a small increase in
net gain with age. For the revised scale there is a decrease in
the amount of gain, and between the original and the revised
there is a substantial increase in net gain with age. In no instance
is there a net loss.


Changes in Dispersion

In studying changes in dispersion of the I.Q. distributions
from test to retest in order to estimate how well the test will
discriminate between members of a group upon retest, it was
thought desirable to keep the population in any particular
comparison constant to avoid any change in dispersion due to a
change in the composition of the group. Four groups were used.

Groups A and C show the changes in dispersion on the original
scale with one and two retests respectively; Groups B and D
do the same for the revised scale. With successive administrations
of the revised scales, the standard deviations decreased, and in
Group D this decrease was significant. This is what might be
expected, since there should be a regression toward the mean
upon retesting. To the extent that there is regression, a given test
discriminates less well between the members of a group upon
retest.


TABLE VII

Changes in the Dispersions of I.Q. Distributions with Retesting

Group   Scale       N             M             σ
A       O₁          79            96.0          13.6
A       O₂          79            [illegible]   14.0
B       L₁ or M₁    [illegible]   100.4         14.7
B       M₂ or L₂    [illegible]   103.1         14.4
C       O₁          105           93.7          11.0
C       O₂          105           [illegible]   12.1
C       O₃          105           [illegible]   12.3
D       L₁ or M₁    127           [illegible]   15.1
D       M₂ or L₂    127           103.1         14.0
D       L₃ or M₃    127           [illegible]   [illegible]

However, in the original scale, the standard deviation of the
first test distribution is smaller, and significantly smaller, than
those of either the second or third administrations. This is true
for both Groups A and C. These latter results may seem contrary
to expectation, but it should be remembered that these
retests occurred a year later on the average and thus the child
was a year older. It is known that there is an increase in variability
with mean test performance. In other words, as children
grow older they tend to be more variable. The operation of this
factor tended to mask the predilection for the distribution to
regress toward the mean in the original scale, while in the revised
scale the regression toward the mean was sufficiently great to
obscure the opposite tendency. From a practical standpoint, then,
it appears that with a given group, the discriminal ability of the
original scale increases slightly upon retest while that of the
revised scale decreases.

In the investigation of scatter, this term will be defined as the
number of age levels through which an individual had to be
tested to obtain his rating, from the level at which he passed all
tests to and including the one at which he failed all. His scatter
as defined above is larger than his range of successes by one age
level. Scatter is used here as an approximate measure of the time
taken to administer the test.² In general, the more levels over
which an individual scatters, the longer it takes to administer the
test.

The amount of scatter is limited by the number of age levels
present in the test and by the age level at which the subject
obtains his basal year score; i.e., an individual getting a basal
score at Year XII on the original scale cannot scatter more than
four test groups, since there are no more. For this reason, the
amount of scatter for both scales has been analyzed according
to the basal year scores obtained. This arrangement makes
explicit any limitation of scatter by the ceiling of the test.

² A more direct measure of the time required, such as the use of a
stop-watch, could not be employed, since this study was obtained from
records which did not contain such information.


TABLE VIII

Amount of Scatter Classified According to the Basal Year
Scores Obtained

Basal Year Score    Original Scale              Revised Scales
of Individuals      N             Mean          N             Mean
III                 [illegible]   5.2           12            6.1
IV                  [illegible]   [illegible]   10            5.3
V                   97            5.5           25            5.5
VI                  119           5.6           [illegible]   5.9
VII                 157           5.8           [illegible]   7.0
VIII                130           5.6           115           7.3
IX                  91            5.1           87            7.1
X                   [illegible]   4.7           81            6.8
XI                  -             -             70            6.5
XII                 [illegible]   4.0           56            6.0
XIII                -             -             [illegible]   5.5
XIV                 [illegible]   [illegible]   106           4.8


Table VIII shows that the scatter increases to a maximum in
the middle range. The maximum scatter is at basal year VII for
the original scale and basal year IX for the revised scale. It is at
these points that the ceiling of the test begins to limit the amount
of scatter. Since there are fewer test groups at the higher age
levels in the original scale, it might be expected that this ceiling
would make its influence felt earlier. This is the case. The
revised scale has the greater scatter throughout, probably as a
result of the increased number of tests in it. At basal age seven,
this difference in scatter, which has been only slight, increases
somewhat.

According to these results it would seem that the revised scale
in general takes a longer time to administer. This is in agreement
with the results reported by Krugman (6). No evaluation can be
made of this finding, since it is not known to what extent the
longer testing time results in increased accuracy of the rating
obtained.


Inversions in Basal Year Scores

Inversions in basal year scores, i.e., instances in which an
individual on a later test makes a lower basal year score than on
his first, were studied, since they cast some doubt upon the
assumption that an individual would answer correctly all those items
below his basal year level. In the original scales such inversions
occur in four per cent of the total possible instances (24 times
out of 589). In the revised scale, they occur in nine per cent of
the total possible instances (35 times out of 412). This difference
is not statistically significant.

Inversions in Item Level

Inversions in item level or skips, i.e., instances in which the
individual failed all the tests at one age level but succeeded on
one or more tests on a higher level, were noted. The assumption
in testing is that the individual will not succeed in any test
beyond the level at which he fails all. This assumption and the
pressing demand of time economy militate against testing the
child beyond the level at which he fails all the tests, so that these
skips are not so frequent as they might otherwise be. To the
extent that the child is not given opportunity to perform on tests
at a higher level where he sometimes achieves a random success,
his rating is an underestimation of his true ability.

In the original scale such skips occurred in four per cent of
the tests (40 times out of a possible 958). In the revised scales
they occurred a little less than one per cent of the time (eight
times out of a possible 823). The difference between these
proportions is significant. Apparently the grouping of tests, from the
viewpoint of avoiding such skips, has been much better in the
revised scales.
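The article does not say how the two pairs of proportions were tested. A minimal sketch, assuming a pooled two-proportion critical ratio (the difference divided by its standard error) and using the counts quoted above; the function name is hypothetical and the choice of test is an assumption.

```python
import math


def proportion_critical_ratio(x1, n1, x2, n2):
    """Difference between two proportions divided by its pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Basal year inversions: revised 35 of 412, original 24 of 589.
basal = proportion_critical_ratio(35, 412, 24, 589)
# Skips: original 40 of 958, revised 8 of 823.
skips = proportion_critical_ratio(40, 958, 8, 823)
print(f"basal inversions: {basal:.2f}, skips: {skips:.2f}")
```

Under this assumption the ratio for basal year inversions falls just below 3 and the ratio for skips is above 4. If a critical ratio of 3 is taken as the threshold (a convention of the period, also assumed here), this agrees with the reported conclusions that the first difference is not significant and the second is.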

Validity of Mental Year Groupings

To test whether the grouping of test items by mental years is
such as to represent for each year's grouping the normal
performance of children of that chronological age, the basal year
scores were analyzed according to the average age of children
achieving those scores. Table IX, giving the mean chronological
ages of children making the various basal year scores, shows that
the mean ages are in every case significantly higher than the year
level indicated by the score.


TABLE IX

Average Ages of Children Making Various Basal Year Scores

Basal    Original Scale                             Revised Scales
Year     N             M             σ              N             M             σ
III      67            5.0           1.0            9             4.6           [illegible]
IV       51            6.3           1.1            [illegible]   5.5           1.0
V        [illegible]   7.2           1.2            23            6.9           [illegible]
VI       [illegible]   9.0           1.6            72            [illegible]   2.2
VII      144           9.9           1.3            57            [illegible]   2.1
VIII     124           11.3          [illegible]    107           11.5          2.3
IX       74            11.9          1.5            87            [illegible]   [illegible]
X        [illegible]   13.0          [illegible]    [illegible]   [illegible]   1.7
XI       -             -             -              [illegible]   [illegible]   2.3
XII      21            13.0          [illegible]    [illegible]   14.4          [illegible]
XIII     -             -             -              61            15.7          [illegible]
XIV      [illegible]   [illegible]   [illegible]    92            16.4          [illegible]

Within the same basal year group, the mean age for the
original scale is in no case significantly different from the mean
age for the revised scale, indicating that though both scales do
not meet the foregoing criterion for the grouping of test items,
one is no better than the other.

Summary 

The data were gathered from 440 normal children who had
taken a total of 958 original and 823 revised Stanford-Binet
examinations. The results indicate that the reliabilities for both
scales are high, over .80, and the correlations between scales are
comparably high. In both scales, children with low I.Q. tend
to gain more upon retests than do the children of average I.Q.,
while those of above-average I.Q. actually tend to lose upon
retesting. For both tests, as the interval between successive
administrations increases, so do the discrepancies between the
test ratings. For both scales, as more tests are taken, the
discrepancies between later tests tend to be smaller than those between
earlier tests.

In the original scale the mean of the absolute deviations is a
maximum in the middle age range; for the revised scale it
decreases with age.

For the original scale there is a small increase in net gain
with increasing age. In the revised scale there is a decrease in
the amount of gain.

The dispersion of I.Q.'s, and therefore the discriminal ability
of the test, increases with successive tests on the original scale;
on the revised, however, the dispersion decreases.

The scatter on the revised scales is greater and reaches its
maximum later than on the original scale.

Inversions in basal year scores are more frequent in the
revised scale, while skips are more frequent in the original scale.
Basal year test groupings on either test do not represent the
normal performance of children of the corresponding age but
rather of children a year or two older.

BIBLIOGRAPHY

1. Berger, Arthur and Speevack, Morris. "An Analysis of the Range of
Testing and Scattering Among Retarded Children on Form L of the
Revised Stanford-Binet." Journal of Educational Psychology, XXXI,
1: 39-44.

2. Bernreuter, Robert G. and Carr, Edward J. "The Interpretation of
IQ's on the L-M Stanford-Binet." Journal of Educational Psychology,
XXIX, 312-14.

3. Carlton, Theodore. "Performance of Mental Defectives on the
Revised Stanford-Binet, Form L." Journal of Consulting Psychology,
IV, 2: 61-5.

4. Harriman, Philip Lawrence. "Irregularity of Successes on the 1937
Stanford Revision." Journal of Consulting Psychology, III, 3: 83-5.

5. Hildreth, Gertrude. "Retests with the New Stanford-Binet Scale."
Journal of Consulting Psychology, III, 2: 49-53.

6. Krugman, Morris. "Some Impressions of the Revised Stanford-Binet
Scale." Journal of Educational Psychology, XXX, 8: 594-603.

7. Munson, Grace and Saffir, Milton A. "A Comparative Study of Retest
Ratings on the Original and Revised Stanford-Binet Intelligence
Scales." Paper delivered at the American Psychological Association
Meeting in California, 1940.

8. Rheingold, Harriet L. and Perce, Frances C. "Comparison of Ratings
on the Original and the Revised Stanford-Binet Intelligence Scales
at the Borderline and Mental Defective Levels." Proceedings, from
the American Association on Mental Deficiency, XLIV (1939), 2:
110-19.

9. Rider, Paul R. An Introduction to Modern Statistical Methods. New
York: John Wiley & Sons, Inc., 1939.

10. Spearman, Charles. "Measuring Intelligence—A Critical Notice."
Human Factor (London), XI (1937).

11. Terman, L. M. The Measurement of Intelligence. Boston: Houghton
Mifflin, 1916.

12. Terman, L. M. and Merrill, M. A. Measuring Intelligence. Boston:
Houghton Mifflin, 1937.



THE PREDICTION OF SCHOLASTIC SUCCESS IN A 
COLLEGE OF MEDICINE 

DEWEY B. STUIT
University of Iowa 

The prediction of scholastic success in the professional colleges
is a major personnel problem and one of primary significance to
the individual, the colleges, and society as a whole. Satisfactory
achievement in the professional courses is the first step toward
vocational success. Unless an individual can perform satisfactorily
the work required for a professional degree, the question of ultimate
vocational success need not be raised.

If the individual can be informed of his chances for success in a
professional college before he enrolls, it should be of great advantage
to him in terms of time and money saved if he should otherwise
fail. At the same time it should encourage those who possess
the necessary ability to make the sacrifices which may be involved.
The net results should be a better adjusted individual and a more
competent profession. While these statements apply to all professional
colleges, they seem particularly pertinent to medicine because
the period of training is long, the expense to the individual is
considerable, and the welfare of society demands highly competent
medical men. The present investigation was undertaken to throw
some light on the problem of predicting success in this professional
area.

Specifically, it was the purpose of this study to investigate the
value of liberal arts grade point averages and certain aptitude test
scores as predictive indices of success in first year medicine at the
State University of Iowa.¹ Because of the variations in grading
standards at different institutions, only those students who completed
all of their undergraduate work at the University and who
had complete records for one year of work in medicine were included
in the study. Prior to 1938, standards for admission to the College
of Medicine required at least two years or 60 semester hours of
work in an approved college of arts and sciences; after 1938 this
was changed to three years or 90 semester hours. Also in 1938 the
required grade point average in liberal arts work was raised from
2.00² to 2.20. The restrictions imposed made it necessary to select
students from entering classes as far back as 1934. The number and
percentages selected from each class are presented in Table I.

¹ The writer wishes to express his appreciation to Dean E. M. MacEwen
of the State University of Iowa College of Medicine for making
available the basic data and to Mr. C. William Applegate, research
assistant in educational personnel, for his contribution to the
statistical analyses made in the study.


TABLE I

Number and Percentage of Students Selected from Various
Freshman Classes in the College of Medicine

Year     Total Enrolled*   No. Used   Percentage
1934     113               34         30.08
1935     121               40         33.05
1936     112               21         18.75
1937     104               33         31.73
1938     55                14         25.45
Total    505               142        28.12

* The total number enrolled in each class includes students completing
their liberal arts work at other institutions in whole or in part, those
registered as freshmen for more than one year, and those who withdrew
in the course of the year.

The predictive indices available for this group of 142 students
included Iowa Qualifying Examination scores, Moss Medical Aptitude
Test scores, and grade point averages for liberal arts work.
The Iowa Qualifying Examination, administered to all entering
freshmen, consists of the Iowa High School Content Examination,
Iowa Silent Reading Test, Iowa Mathematics Aptitude Test, and
the English Training Examination. A composite score, consisting
of a weighted raw score total, is computed for the group of four
examinations and is used as the score in the Iowa Qualifying
Examination.

The purpose of this examination is to assist counselors in their
advisory work with students and to predict the scholastic success
of undergraduate students in various colleges and curricula. The
Moss Medical Aptitude Test is administered to applicants for
admission to colleges of medicine by the American Association of
Medical Colleges. In considering the liberal arts work, the total
grade point average and the "required science" and "total science"
grade point averages were studied separately. Required science, as
distinguished from total science, includes 32 hours of prescribed
courses. The specific subjects prescribed in the liberal arts
curriculum are inorganic chemistry through qualitative analysis,
quantitative analysis, elementary organic chemistry, elementary
physics, and biological science, usually zoology.

² Grade point averages or point hour ratios are computed by
considering A = 4, B = 3, C = 2, D = 1, Fd = 0.
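The footnote gives the grade values but not the arithmetic. A point hour ratio is ordinarily the sum of grade points weighted by credit hours, divided by the total hours; that weighting is an assumption here, and the course record in the sketch is invented purely for illustration.

```python
# Grade values as given in the footnote (Fd = failure).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "Fd": 0}


def point_hour_ratio(courses):
    """Grade point average weighted by semester hours.

    `courses` is a list of (grade, semester_hours) pairs.
    """
    total_hours = sum(hours for _, hours in courses)
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in courses)
    return total_points / total_hours


# Hypothetical record: four courses totalling 16 semester hours.
record = [("A", 4), ("B", 4), ("C", 3), ("B", 5)]
phr = point_hour_ratio(record)
print(round(phr, 2))
```

On this scale the 2.00 and 2.20 admission requirements correspond to a C average and slightly better.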

The criterion of success in first-year medicine consisted of the
student's grade point average at the close of the academic year.
Makeups for subject conditions and incompletes were disregarded.
It was felt that the grade first assigned in a course should be used
because it represented a better appraisal of the student's performance
in comparison with his fellows. The same practice was followed
in computing grade point averages for the second year of
work.

The raw scores in the aptitude tests had been converted to
percentiles and were thus recorded. It was assumed that these
percentiles were equivalent from year to year. For computational purposes
the percentiles were converted into linear scores by the use of
Hull's table.
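Hull's table is not reproduced in the article, so the sketch below is not that table. It only illustrates the general idea of turning a percentile into a linear (normalized) score through the inverse normal curve. The mean of 50 and standard deviation of 20 are assumptions, chosen because they roughly reproduce percentile and linear-score pairs that appear together in Table II (for example percentile 2 with linear score 9, and percentile 8 with linear score 22).

```python
from statistics import NormalDist

ASSUMED_MEAN = 50.0  # assumption, not taken from Hull's table
ASSUMED_SD = 20.0    # assumption, inferred loosely from the ranges in Table II


def percentile_to_linear(percentile):
    """Map a percentile (strictly between 0 and 100) to a normalized linear score."""
    z = NormalDist().inv_cdf(percentile / 100.0)  # standard normal deviate
    return ASSUMED_MEAN + ASSUMED_SD * z


for p in (2, 8, 40, 50):
    print(p, round(percentile_to_linear(p), 1))
```

The point of such a conversion is that equal steps in linear score represent equal distances on the assumed normal scale, which percentiles do not, so means, standard deviations, and product-moment correlations can be computed on the converted scores.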


Student Performance

The performance of the students in the aptitude tests and liberal
arts work is shown in Table II. The mean linear score of 44.50 in
the Moss Medical Aptitude Test is equivalent to a percentile score
of about 40 on nation-wide norms. Data were also available for 240
additional students who did not meet all of the criteria used in the
selection of the 142 students included in this study. It will be noted
that the mean linear score for this group is slightly higher, but the
range is almost identical. In the Iowa Qualifying Examination and
its sub-tests, the group is definitely superior as indicated by the
mean linear scores, but the range is very wide, varying from the
seventh to the ninety-ninth percentile in the composite score. The
mean grade point average of these students in liberal arts work is
also definitely superior. The average point hour ratio for the
College of Liberal Arts is about 2.20, while the students in this group
achieved a 2.60 average. From these data one might conclude that
the typical student who goes into medicine at Iowa is definitely
superior in the Iowa Qualifying Examination and in his liberal arts
work, but he may be somewhat below the average in the Moss
Medical Aptitude Test.

The second phase of the study was concerned with the
relationship between the various predictive indices and scholastic success
in first year medicine. The coefficients of correlation expressing
these relationships were computed by the product-moment method
and are presented in Table III.



TABLE II

Student Performance in the Aptitude Tests and in Liberal Arts Work

Predictive Index             N             M       σ       Range
Moss Medical Aptitude        [illegible]   44.50   15.05   Linear Score 13-83; Percentile 3-96
Moss Medical Aptitude        382*          46.00   16.05   Linear Score [illegible]-87; Percentile [illegible]-98
Iowa Qualifying Exam.
  1. Composite Score         142           62.29   15.57   Linear Score 21-91; Percentile 7-99
  2. High School Content     142           62.90   15.85   Linear Score 9-91; Percentile 2-99
  3. Math. Aptitude          142           62.65   15.94   Linear Score 21-91; Percentile 7-99
  4. English Training        142           56.59   15.65   Linear Score 9-91; Percentile 2-99
  5. Silent Reading          139           50.75   16.28   Linear Score 22-91; Percentile 8-99
Required Science             142           2.62    .453    1.50‡-4.00
Total Science                142           2.60    .308    1.72‡-4.00
Total Liberal Arts           142           2.60    .376    1.81‡-3.61
Fresh. Med. P.H.R.           142           2.36    .619    0.53-3.84
Fresh. Med. P.H.R.           382           2.33    .643    [illegible]-4.00
Fresh. Med. P.H.R.           112†          2.45    .524    1.57-3.84
Soph. Med. P.H.R.            112†          2.16    .528    1.03-3.63

* A supplementary study was made of 382 students.
† Students of the group of 142 who completed two successive years.
‡ The 2.20 requirement was in effect in 1938. Previous to 1930 this had
been 1.50, and was then raised to 2.00. A few students admitted in 1930
did not enroll until 1934. Only one had a total grade point average
below 2.00.


TABLE III

Correlation of the Predictive Indices with the Criterion

                                       N      r             P.E.r
The Iowa Qualifying Examination
  Composite Score                      142    [illegible]   .056
  High School Content Examination      142    .058          .056
  Mathematics Aptitude Test            142    .108          .056
  English Training Examination         142    .025          .056
  Silent Reading Test                  139    .075          .056
Moss Medical Aptitude Test             142    .226          .054
Moss Medical Aptitude Test             382    .316          .031
Liberal Arts Grade Point Average
  Required Science                     142    .419          .046
  Total Science                        142    .465          .045
  Total Liberal Arts Work              142    .449          .045
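The article does not state how the P.E. column of Table III was obtained. The classical formula for the probable error of a product-moment coefficient is P.E. = 0.6745 (1 - r²) / √N; that this is the formula used here is an assumption. A short sketch applying it to the two Moss Medical Aptitude Test rows:

```python
import math


def probable_error_r(r, n):
    """Classical probable error of a product-moment correlation coefficient."""
    return 0.6745 * (1.0 - r ** 2) / math.sqrt(n)


# Moss Medical Aptitude Test rows of Table III.
pe_142 = probable_error_r(0.226, 142)
pe_382 = probable_error_r(0.316, 382)
print(round(pe_142, 3), round(pe_382, 3))
```

Both results round to the tabled values (.054 and .031). Applied to the other rows, the same formula agrees with the table to within about .001.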



Inspection of Table III reveals that the Iowa Qualifying
Examination correlates very low with success in first year medicine, that
the correlation between the Moss Medical Aptitude Test scores and
the criterion is hardly significant, and that the total science average
is most closely associated with success in medicine as measured by
first year grades. Examination of the scatter-diagrams provided
several clues which may explain certain of the correlations. In the
Iowa Qualifying Examination only 27 students received linear
scores below 50, hence seriously restricting the range of talent of
this group. As a result a majority of the students are concentrated
in the first and second quadrants, those in the second quadrant
having received high scores in the qualifying examination but achieving
below average in first year medicine. Much the same picture is
presented for each of the sub-tests comprising the qualifying
examination. The data suggest a critical linear score of 40 or 45 in the
composite score of the qualifying examination, for only eight students
with qualifying scores below a linear score of 45 succeeded in
making a 2.00 average or better in first year medicine.

The scatter-diagrams of the Moss Medical Aptitude Test present
a striking contrast to those of the Iowa Qualifying Examination. A
significant proportion of the students who score low in the test do
very well in freshman medicine. As shown in Table II, the average
grade in first year medicine is 2.36 and in the Moss test the mean
linear score is 44.50. A total of 29 students, or slightly over 20 per
cent, scored below average in the Moss test but made grades above
2.40 in first year medicine. The student scoring lowest in the
aptitude test succeeded in making a 2.60 grade point average. Poor
performance in the Moss Medical Aptitude Test does not appear
to indicate with a high degree of certainty that the student will do
poorly in medicine at this institution. The scatter-diagrams for the
382 students present a similar picture.

Of the indices computed from the students' liberal arts records,
the total science grade point average correlates best with scholastic
success in medicine. However, there are some extreme deviates who
reduce the magnitude of the coefficient of correlation. For example,
one student with a 2.00 liberal arts record made a 3.35 grade point
average in medicine, while another with a 3.00 record in liberal arts
work made only a 1.50 average in medicine. In general, however,
there is rather close agreement between the grades in the two
curricula. It does not appear that the liberal arts science record is
superior in predictive capacity to the student's general average in
undergraduate work. When the total science average and Moss
Medical Aptitude Test scores are combined as predictive indices,
the multiple correlation is .494. Apparently the Moss test does not
add greatly to the predictive capacity of the science grades.
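The size of a two-predictor multiple correlation depends on the correlation between the predictors themselves, which the article does not report. The sketch below uses the standard two-predictor formula with the Table III validities (.465 for total science, .226 for the Moss test) and a purely hypothetical intercorrelation of .13, chosen only because it reproduces a multiple correlation near .494; it is not a value from the study.

```python
import math


def multiple_r(r1y, r2y, r12):
    """Multiple correlation of a criterion with two predictors.

    r1y, r2y: correlations of each predictor with the criterion.
    r12: correlation between the two predictors.
    """
    r_squared = (r1y ** 2 + r2y ** 2 - 2 * r1y * r2y * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# r12 = 0.13 is hypothetical; the article does not give this value.
R = multiple_r(0.465, 0.226, 0.13)
print(round(R, 3))
```

Whatever the true intercorrelation, the gain from .465 to .494 is small, which is the basis for the remark that the Moss test adds little to the science grades.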

Supplementary evidence concerning the relation between the
Moss Medical Aptitude Test scores and liberal arts grade point
averages on the one hand and success in medicine on the other is
furnished in Table IV. With one exception the mean grade point
average in liberal arts work increases as the level of achievement in
medicine increases. This is also true of the mean score in the Moss
Medical Aptitude Test, but the trend is not as pronounced. It is
interesting to note that the range of performance as regards
predictive indices is about the same for all levels of achievement. This
makes the low correlations less surprising.


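TABLE IV

Moss Medical Aptitude Test Mean Linear Scores and Liberal Arts Mean
Grade Point Averages for Various Levels of Achievement
in Freshman Medicine

Freshman                 Moss Aptitude
Medicine P.H.R.   N      M       σ       Range
3.00-4.00         24     52.17   14.93   22-83
2.50-2.99         34     40.59   14.61   13-73
2.00-2.49         42     47.60   13.06   19-71
1.50-1.99         30     40.28   13.22   15-68
0.00-1.49         12     37.17   15.02   19-76

Freshman                 Required Science
Medicine P.H.R.   N      M       σ             Range
3.00-4.00         24     2.93    .51           [illegible]-4.00
2.50-2.99         34     2.66    [illegible]   [illegible]-3.16
2.00-2.49         42     2.65    .40           1.63-3.50
1.50-1.99         30     2.37    .35           1.50-[illegible]
0.00-1.49         12     2.25    [illegible]   1.75-2.75

Freshman                 Total Science
Medicine P.H.R.   N      M             σ             Range
3.00-4.00         24     [illegible]   [illegible]   2.00-4.00
2.50-2.99         34     [illegible]   .34           2.00-3.22
2.00-2.49         42     [illegible]   .28           [illegible]-3.22
1.50-1.99         30     [illegible]   .33           1.72-3.07
0.00-1.49         12     [illegible]   [illegible]   2.00-2.61

Freshman                 Total L. A. Work
Medicine P.H.R.   N      M             σ             Range
3.00-4.00         24     [illegible]   .31           2.16-3.61
2.50-2.99         34     2.60          .21           [illegible]-3.13
2.00-2.49         42     2.65          .22           2.05-3.50
1.50-1.99         30     [illegible]   .11           1.81-3.31
0.00-1.49         12     2.28          [illegible]   2.00-2.58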


TABLE V

Student Persistence in the College of Medicine
at the State University of Iowa

Year Entered   One Year   Two Years   Three Years   Four Years
               N          N           N             N
1934           34         30          29            27
1935           40         35          32            32
1936           21         19          17
1937           33         28
1938           14


In order to ascertain whether generalizations made concerning
first year medicine would apply to other years, the freshman and
sophomore grades for 112 students were correlated. The resulting
coefficient was .722. It also seemed desirable to know if the students
who complete one year of work continue beyond that point. The
results are presented in Table V and seem to warrant the conclusion
that the persistence of students beyond the freshman year is
very high. It also appears that the first year's work is strongly
indicative of later success in the medical school.
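The persistence claim can be checked against the counts in Table V. The following is a minimal Python sketch, not part of the original study: the dictionary layout and the function name `persistence_rates` are illustrative assumptions, and the 1937 second-year count is read as 28, a reading under which the second-year entries sum to the 112 students mentioned above.

```python
# Counts read from Table V: students enrolled after one, two, three,
# and four years, by year of entry. The 1937 second-year entry is read
# as 28 (an assumption; with it the second-year counts sum to 112).
TABLE_V = {
    1934: [34, 30, 29, 27],
    1935: [40, 35, 32, 32],
    1936: [21, 19, 17],
    1937: [33, 28],
}


def persistence_rates(counts):
    """Return each later count as a proportion of the first-year count."""
    first = counts[0]
    return [round(c / first, 3) for c in counts[1:]]


for year, counts in TABLE_V.items():
    print(year, persistence_rates(counts))

# Students who have both freshman and sophomore records.
second_year_total = sum(counts[1] for counts in TABLE_V.values())
print("students with two years of grades:", second_year_total)
```

Under these readings, between about 80 and 90 per cent of each entering class is still enrolled a year later.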





PREDICTION OF SUCCESS IN A COLLEGE OF MEDICINE 


The results of the present study agree very well with those found
at other institutions in this region. At Minnesota⁴ the correlation
between Moss Aptitude Test scores and freshman honor points was
found to be .27 for the class of 1938 and .22 for the class of 1939.
Liberal arts grades for these same classes showed correlations with
freshman grades in medicine of .57 and .46 respectively. In a study
made of the classes entering the University of Illinois⁵ in 1932 and
1933 the correlations between liberal arts averages and achievement
in first year medicine were found to be .49 and .41 respectively.
Comparable correlations for the liberal arts average in science were
.57 and .42. The Moss Medical Aptitude Test was administered to
the class entering in 1932 and correlated to the extent of .42 with
first year medicine. Not all the reports on the prediction of success
in medicine published in the Journal of the Association of American
Medical Colleges are in agreement with these findings. Some
report the Moss test as being superior to the liberal arts grade
point average in predicting success in medicine, while others find
the reverse to be true. Apparently all agree, however, that aptitude
tests and the undergraduate grade point averages furnish information
which is valuable in selecting students for medical colleges.

Conclusions 

The data seem to warrant the following conclusions for the
population included in this study or populations which are similar.

1. Liberal arts grade point averages are the best predictive
indices of success in first year medicine. Required science, total
science, and total liberal arts work are of about equal value in this
respect.

2. The correlation between the Iowa Qualifying Examination
scores and grades in freshman medicine is very low. However, the
data do suggest a critical score which might be used by counselors
in their advisory work with students who are interested in medicine
as a career.

3. In this institution the Moss Medical Aptitude Test does not
predict the student's level of achievement with high precision.
Students scoring low in the test may do very well in medicine.

⁴ J. W. Cavett, A. T. Henrici, and S. B. Lindley, "Tests of Medical
Aptitude at Minnesota," Journal of the Association of American Medical
Colleges, XII (September, 1937), 257-68.

⁵ George R. Moon, "Study of Premedical and Medical Scholastic Records
of Students in the University of Illinois College of Medicine,"
Journal of the Association of American Medical Colleges, XIII (1938).



4. To predict success in medicine with greater accuracy will
require tests which distinguish more clearly between various levels
of ability. It is also possible that the criterion of success will need
to be defined more precisely.

5. The counselor should not rely solely upon aptitude test
scores and grade point averages in advising students about their
probable success in medicine. Perhaps average achievement in
aptitude tests and liberal arts work plus high interest and motivation
will insure the student's scholastic success.





A COMPARATIVE STUDY OF FRESHMAN WEEK TESTS
GIVEN AT THE UNIVERSITY OF CHICAGO*

WILLIAM M. SHANNER
Civil Aeronautics Authority
and
G. FREDERIC KUDER
Social Security Board

One of the most crucial problems that confronts the educator
of today is that of correctly advising students as to their educational
and vocational careers. In conjunction with this problem, educators
and psychologists have constantly worked to secure more
reliable and accurate information for use in counseling students.
Almost every college and university now has an orientation week
at which time all incoming students are required to take batteries
of psychological and placement examinations, the results of
which are used in advising students relative to their educational
programs.

In September, 1938, a comprehensive battery of psychological
and placement tests was administered to the incoming freshman
class at the University of Chicago with a view toward studying
the relationship between these tests and succeeding academic
achievement at the university. Among the tests administered to the
freshman group were the sixteen sub-tests of the American Council
on Education Tests for Primary Mental Abilities, Experimental
Edition; the 1938 Form, College Edition of the American
Council on Education Psychological Examination; the College
Entrance Examination Board's Scholastic Aptitude Examination;
a physical sciences aptitude test; a social sciences aptitude test;
Pressey's Special Reading Test, Form A; Pressey's Test on Reading
Comprehension, Form A; and a vocabulary test. The physical
sciences aptitude, social sciences aptitude, and vocabulary tests
were locally constructed.

*This study was made while the writers were with the Board of
Examinations at the University of Chicago.

The first two years of the University of Chicago are devoted
to a program of general education. The curriculum includes four
introductory survey courses in the following fields: biological
sciences, humanities, physical sciences, and social sciences, and a
number of elective second-year or sequence courses in specific
subjects. The typical student takes two of the four general
courses during his freshman year and the two remaining courses
during his sophomore year. His program each year is completed
by one or more of the other courses offered in the college. The
general courses extend throughout the school year, and achievement
is measured at the close of the year by means of a six-hour
comprehensive examination. Attendance at a course is not
required. The only requirement is the successful passing of the
comprehensive examination. Many students are advised, upon the
basis of their performances on the freshman week examinations,
to attempt a comprehensive examination without taking the
course. Since all students entering the University of Chicago
as freshmen are required to take the four general courses,
educational advisers are confronted with the problem of selecting the
most appropriate general courses for the educational program of
the student and advising him as to whether he needs additional
assistance or should attempt the comprehensive examination
without taking the course.*

By June, 1939, 501 of the freshmen entering the University in
September, 1938, had taken one or more of the comprehensive
examinations for the four survey courses and various sequence
courses. The grades of the comprehensive examinations are
reported in terms of derived scores having a mean of 20 and a
standard deviation of 4. The average examination grade of each
student was found by adding the derived scores for all his
comprehensive examinations and dividing by the number of
examinations.
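The grading arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the text does not say how the derived scores were obtained, so the linear rescaling in `derived_scores` is an assumption, and the function names and sample figures are hypothetical.

```python
from statistics import mean, pstdev


def derived_scores(raw_scores, target_mean=20.0, target_sd=4.0):
    """Linearly rescale raw scores to a given mean and standard deviation.

    Assumption: a simple linear (z-score) transformation; the source
    states only the mean (20) and standard deviation (4) of the result.
    """
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]


def average_grade(student_derived_scores):
    """Sum of a student's derived scores divided by the number of examinations."""
    return sum(student_derived_scores) / len(student_derived_scores)


# Hypothetical raw scores for one comprehensive examination.
raw = [52, 61, 70, 79, 88]
scaled = derived_scores(raw)
print([round(x, 2) for x in scaled])

# Hypothetical student who took three comprehensive examinations.
print(average_grade([24.0, 18.0, 21.0]))
```

Because every examination is expressed on the same scale, a student's average is comparable whether he took one examination or several.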


Test Scores and Average Grades 

Table I reports the correlation between the various freshman
week tests and average examination grades. The testing time of
each examination is also reported. The social sciences aptitude
test has the highest correlation with average grades (.575). The
physical sciences aptitude, the American Council Psychological
Examination, and the College Entrance Examination Board's
Scholastic Aptitude Examination have just slightly lower
correlations. All these tests, with the exception of the CEEB Scholastic
Aptitude, require approximately one hour of testing time
each; the CEEB examination requires two hours' time.

*For a comprehensive description of the organization of the first two
years of the University of Chicago, see Chauncey Samuel Boucher and
A. J. Brumbaugh, The Chicago College Plan (Chicago: The University
of Chicago Press, June, 1940), pp. xiii-413.

The social sciences aptitude test is essentially a reading test.
It consists of three selections, one each drawn from the fields of
economics, sociology, and political science. Each selection is
followed by a number of questions based upon an understanding
of the materials covered. The test is a revision of a test given
experimentally the previous year.

The physical sciences aptitude test consists of (1) a section
on vocabulary in the field, (2) questions involving the interpretation
of mathematical formulas, and (3) a reading test containing
chemistry and physics selections. It is the product of a process
of analysis and revision carried out over a period of years.

The results of the 16 tests of the Primary Mental Abilities
battery are reported in terms of seven composite scores, each an
approximation to a factor. Scores for the following abilities are
reported for the test.

Perceptual. This ability, measured by the verbal enumeration
and identical forms tests, may be described as one's facility in
finding detail which is significant to him or detail which he is
seeking.

Number. This factor consists of facility with simple numerical
work and is measured by the tests of rapid addition and
multiplication.

Verbal. The verbal factor manifests itself in the completion
and in same or opposite tests. It is roughly the ability to deal
readily and quickly with verbal materials.

Spatial. The spatial factor is measured by tests requiring the
subject to think visually of geometric forms and of objects in
space.

Memory. This factor is one's ability to memorize various materials.
One test requiring the memorization of initials with names,
and a second test requiring the association of words with numbers,
are used in measuring the ability.

Inductive Reasoning. The induction factor may be described
as one's ability to discover some rule or principle in various
arrangements of material. A numerical, a verbal, and a spatial test
are used in estimating the ability.



Deductive Reasoning. The deductive factor may be described
as facility in formal reasoning. It is measured by tests of arithmetic
problems, number series, and perception of mechanical
movements.

The largest correlation between a composite primary ability
score and average grades is for the verbal composite (.415), which
requires 16 minutes of testing time. The remaining correlations are
much lower. These results are reasonable, since the composite
scores represent specific abilities, and average grades represent
general academic proficiency.

Test Scores and Course Grades 

Table II reports the correlation between various freshman week
tests and grades for the four general courses. The two largest
correlations (.654 and .648) are for the physical and social sciences
aptitude tests with the respective general courses. The correlations
for the American Council Psychological and the CEEB Scholastic
Aptitude Examination show no statistically significant difference
for the biological sciences, social sciences, and physical sciences
general course examinations. However, the correlation between the
CEEB and humanities is significantly greater than between
humanities and the American Council Psychological Examination. It
is of interest to note the variations in the size of the correlation
coefficients for the composite scores of the Primary Mental Abilities
battery. The two largest correlations for the composite scores are
between Deduction and grades in the physical sciences, and between
Verbal and grades in the humanities. One might very well expect
these phenomena. At the same time, humanities shows a correlation
of only .071 with the composite Spatial score.

The degree of independence of the seven composite scores of the
Primary Mental Abilities battery is reported in Table III, which
gives the intercorrelations for the seven scores. Slightly over half
of the correlations in the table are less than .300 and two can be
considered as zero. These small correlations suggest a considerable
degree of independence for these scores and that they might well
measure specific abilities. The intercorrelations among Spatial,
Induction, and Deduction scores are all very near .500 and thus give
evidence of considerable dependence of scores.

The first line of Table IV reports the multiple correlation
coefficients for the combination of the two primary ability composites
having the highest validities with respect to each of the four general
courses. The Verbal and Deduction scores, from tests requiring 70
minutes, were combined for all except the humanities course. For
this course the Verbal and Number scores, from tests requiring 35
minutes, were combined. The second line of Table IV reports the
multiple correlations obtained by combining all seven scores of the
Primary Mental Abilities Tests. These coefficients are not markedly
higher than those obtained from the best two in each case.
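The way two composites combine into a multiple correlation can be illustrated with the standard two-predictor formula. The Python sketch below is an illustration only, not the authors' computing procedure; the function name is hypothetical, and the inputs are the Verbal and Deduction correlations with physical sciences grades (read from Table II as .376 and .485) together with the Verbal-Deduction intercorrelation (read from Table III as .360). Because the intercorrelation comes from the whole group rather than from the students in one course, the result need not agree exactly with Table IV.

```python
import math


def multiple_r_two_predictors(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2 -- correlation of each predictor with the criterion
    r12    -- correlation between the two predictors
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Physical sciences grades predicted from Verbal and Deduction.
# Table readings (assumed): Verbal .376, Deduction .485, intercorrelation .360.
r = multiple_r_two_predictors(0.376, 0.485, 0.360)
print(round(r, 3))
```

The formula shows why a second composite adds little when it is highly correlated with the first: as the intercorrelation rises, the gain over the better single predictor shrinks.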


TABLE I

Testing Time Required for Administering Various Psychological and
Placement Tests to the 1938 Freshman Class at the University of Chicago
and the Correlation of these Tests with Average Grades

                                            Testing     Correlation
                                            Time in     with Average
Test                                        Minutes     Grades
Perception*                                   20          .117
Number*                                       19          .310
Verbal*                                       16          .415
Spatial*                                      33          .184
Memory*                                       33          .204
Induction*                                    46          .229
Deduction*                                    54          .378
American Council Psychological
  Examination                                 56          .523
College Entrance Examination
  Board's Scholastic Aptitude                120          .542
Physical Sciences Aptitude                    60          .522
Social Sciences Aptitude                      60          .575
Pressey's Special Reading                     60          .477
Pressey's Reading Comprehension               †           .326
Vocabulary                                    2[?]        .486

* Composite scores for the Thurstone Tests for Primary Mental
Abilities.

† Specific time limits are not given; the students are given the time
necessary to read the entire reading selection and answer the questions.
Approximately 20 minutes are required.




TABLE II

Coefficients of Correlation between Various Psychological and
Placement Tests and the Four Introductory General Courses at the
University of Chicago

                                     Biological              Physical   Social
Test                                 Sciences    Humanities  Sciences   Sciences
Perception*                            .084        .129        .166       .135
Number*                                .207        .2[?]5      .272       .300
Verbal*                                .380        .472        .376       .435
Spatial*                               .225        .071        .139       .131
Memory*                                .145        .127        .177       .160
Induction*                             .216        .029        .247       .196
Deduction*                             .418        .190        .485       .427
American Council Psychological
  Examination                          .403        .465        .482       .569
College Entrance Examination
  Board's Scholastic Aptitude
  Examination                          .479        .544        .471       .577
Physical Sciences Aptitude              --          --         .654        --
Social Sciences Aptitude                --          --          --        .648

* Composite scores for the Thurstone Tests for Primary Mental
Abilities.


TABLE III

Intercorrelations for the Seven Composite Scores for the Thurstone
Tests for Primary Mental Abilities

              Number   Verbal   Spatial   Memory   Induction   Deduction
Perception     .237     .371     .392      .040      .355        .153
Number                  .250     .204      .108      .306        .336
Verbal                           .21[?]    .156      .347        .360
Spatial                                    .077      .490        .475
Memory                                               .170        .120
Induction                                                        .533



TABLE IV

Multiple Correlation Coefficients between Various Combinations of the
Composite Scores of the Primary Mental Abilities Tests and the Four
Introductory General Courses

Combination of                       Biological              Physical   Social
Composite Scores                     Sciences    Humanities  Sciences   Sciences
Two Best Predicting Composite
  Scores                               .484        .490        .529       .521
All Seven Composite Scores             .500        .541        .561       .556


Conclusion 

Two rather striking observations may be made on the basis of
the results reported.

(1) Marks in the four courses can be predicted by combining
two fairly short primary abilities measures about as well as
by using the one-hour American Council Psychological
Examination or the two-hour scholastic aptitude test of the
College Entrance Examination Board, both of which were
constructed for the purpose of predicting scholarship. This
result is the more remarkable since the Primary Abilities
Tests were not specifically constructed for the purpose of
predicting grades.

(2) Tests developed for the specific situation are in the present
case more efficient prognostic measures than any other
single test or combination of tests studied. The validities
of the aptitude tests for the physical sciences and the social
sciences are significantly higher than the other validities
obtained.

These two results appear to be essentially contradictory. One
of them argues for the development of a number of relatively
independent measures and the use of those which, in combination, are
most efficient for the prediction of any selected criterion or group
of criteria. The other seems to indicate that tests constructed and
revised in the light of analysis with respect to the local situation
are most effective, at least when compared with two of the better
scholastic aptitude tests constructed for general use. This conclusion
is valid for the tests studied in their present state of development.
However, it is apparent that what can be measured in composite
tests, such as the aptitude tests in the fields of the social and
physical sciences, can be measured by a number of more specific
and relatively independent measures. The difference between the
predictive efficiency of the primary abilities measures used as
compared with the specific aptitude tests must be attributed to the fact
that the former do not sample some of the attributes included in
the latter. The fact that the Primary Mental Abilities Tests in
combination produce fairly high validities although they were not
developed for the purpose of predicting scholarship is indicative of
the promise in this type of measurement. As the experimental tests
of primary abilities are perfected and expanded to include other
abilities involved in scholastic success, it is reasonable to expect
combinations of them to approach and equal the validities of tests
constructed for each of a number of specific situations. This
development will make practical a much more efficient use of test
material when a number of criteria are to be predicted.





NOTE ON A SIMPLIFIED METHOD OF COMPUTING
TEST RELIABILITY

C. J. HOYT
University of Minnesota

Kuder and Richardson¹ have presented the theoretical background
as well as useful formulas for a new and improved procedure
for estimating the coefficient of test reliability. In a later
paper² they have labeled their procedure "the method of rational
equivalence." Their results appear to have a number of important
advantages over the split-half correlation method used in
conjunction with the Spearman-Brown formula. With the split-half
method the obtained coefficient may be an overestimate or an
underestimate of the actual reliability. With the method of
rational equivalence the estimate derived is known to be never
an overestimate.³ This fact alone is sufficient for recommending
the displacement of the split-half procedure, although there are
other advantages, as pointed out below.

The theoretical soundness of the Kuder-Richardson derivation
is indicated by the fact that analysis of variance techniques
applied to this problem produce an identical formula. The present
writer's derivation, using an approach entirely different from that
used by Kuder and Richardson, will appear elsewhere.

The use of the formula recommended by the authors for general
use requires only the same primary data as are ordinarily
obtained in a careful analysis of a test. Consequently, it is not
necessary to obtain the scores on separate parts of the test. The
possibility of obtaining varying results with different methods
of dividing the test is also obviated. The computations involved
in computing the coefficient of reliability by the method of
rational equivalence can be performed in a few simple steps that
do not require any special statistical knowledge.

¹ G. F. Kuder and M. W. Richardson, "The Theory of the Estimation of
Test Reliability," Psychometrika, II (1937), 151-60.

² M. W. Richardson and G. F. Kuder, "The Calculation of Test
Reliability Coefficients Based on the Method of Rational Equivalence,"
Journal of Educational Psychology, XXX (1939), 681-87.

³ This statement is strictly true for the population used. Sampling
errors are, of course, not eliminated. For a discussion of sampling errors
the reader is referred to Robert W. Jackson, "Reliability of Mental Tests,"
British Journal of Psychology, XXIX (1939), 267-87.

The increasing use of the method of rational equivalence for
the estimation of test reliability leads the writer to describe a
procedure which he has found to be particularly efficient.
Although Kuder and Richardson present a number of formulas
involving various degrees of rigor, they recommend their formula
(20) for general use. Their empirical findings and those of a
number of others who have been using the method indicate that the
results obtained from their formula (20) closely approximate
those obtained by the more rigorous formulas. The steps outlined
below have therefore been developed for a variant of the
recommended formula.⁴

1. Score the tests for the number of right answers. Obtain
the sum of these scores for all the subjects. This value
is T in formula (1) below.

2. Square each of these scores and obtain the sum of these
squares for all the subjects. This sum is S_s in the
formula below.

3. Make a tally of the test responses to each item and
obtain the count of the number correct for each item.
The total of these counts should equal the T obtained
in step 1.

4. Square the count obtained for each item and obtain the
sum of these squares for all the items. This sum is S_i
in the formula below.

5. Using the values obtained in the steps above, solve the
following formula for r_tt, the reliability of the test. In
this formula, k is the number of subjects taking the test
and n is the number of items in the test.


$$ r_{tt} = \frac{n}{n-1} \cdot \frac{k S_s + S_i - T(T + k)}{k S_s - T^2} \qquad (1) $$


In the use and analysis of a test some of these steps will
already have been performed. Use of the item counter of the
International Scoring Machine will greatly facilitate step 3. If
a computing machine is not available, steps (2) and (4) can be
greatly facilitated by the use of a table of squares and an
ordinary adding machine.

⁴ While the present paper was in press, Paul L. Dressel published other
variants of the Kuder-Richardson formulas. Formula (1) above is
equivalent to [illegible] in Dressel's paper. Paul L. Dressel, "Some Remarks
on the Kuder-Richardson Reliability Coefficient," Psychometrika, V
(1940), 305-10.

The use of formula (1) will be illustrated in a particular
example involving a test of 250 items administered to a group of
33 students in the College of Pharmacy at the University of
Minnesota. The values obtained in this case were as follows:

S_i = 112,873     T = 4829     S_s = 727,351

$$ r_{tt} = \frac{250}{249} \cdot \frac{33(727{,}351) + 112{,}873 - 4829(4829 + 33)}{33(727{,}351) - (4829)^2} = \frac{250}{249} \cdot \frac{636{,}858}{683{,}342} = \frac{159{,}214{,}500}{170{,}152{,}158} = .936 $$
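The five steps and formula (1) translate directly into a short program. The following Python sketch is offered only as an illustration; the function names and the layout of the response matrix (one row per subject, one 0/1 entry per item) are assumptions, and S_s is taken as 727,351, the value that appears in the computation line of the example.

```python
def sums_from_responses(responses):
    """Steps 1 to 4: return T, S_s, and S_i from a 0/1 response matrix.

    responses -- list of rows, one per subject; each row holds 1 for a
                 right answer and 0 otherwise, one entry per item.
    """
    scores = [sum(row) for row in responses]             # step 1
    item_counts = [sum(col) for col in zip(*responses)]  # step 3
    T = sum(scores)
    assert T == sum(item_counts)  # the check mentioned in step 3
    S_s = sum(s * s for s in scores)                     # step 2
    S_i = sum(c * c for c in item_counts)                # step 4
    return T, S_s, S_i


def reliability(n_items, k_subjects, T, S_s, S_i):
    """Step 5: formula (1) for the reliability coefficient."""
    numerator = k_subjects * S_s + S_i - T * (T + k_subjects)
    denominator = k_subjects * S_s - T ** 2
    return n_items / (n_items - 1) * numerator / denominator


# Values from the College of Pharmacy example.
r_tt = reliability(n_items=250, k_subjects=33, T=4829, S_s=727351, S_i=112873)
print(round(r_tt, 3))
```

With raw responses in hand, `sums_from_responses` supplies the three totals and `reliability` completes step 5; with totals already tabulated, as in the example, only the second function is needed.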

Formula (1) is algebraically equivalent to formula (20) presented
by Kuder and Richardson. Their formula (20) is as follows:

$$ r_{tt} = \frac{n}{n-1} \cdot \frac{\sigma_t^2 - n\,\overline{pq}}{\sigma_t^2} = \frac{n}{n-1} \cdot \frac{\sigma_t^2 - \sum p_i q_i}{\sigma_t^2} $$

where σ_t is the standard deviation of the distribution of test
scores, p_i is the proportion of students passing each item taken
in turn, and q_i is the proportion failing that item.
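The equivalence can be verified in a few lines. The sketch below is not taken from the paper: it assumes that the variance of the test scores is computed with divisor k (the number of subjects), and it introduces the symbol c_i for the number of right answers to item i, so that the sum of the c_i is T and the sum of their squares is S_i.

```latex
\begin{align*}
\sigma_t^2 &= \frac{S_s}{k} - \frac{T^2}{k^2} = \frac{k S_s - T^2}{k^2}, \\
\sum_i p_i q_i &= \sum_i \frac{c_i}{k}\left(1 - \frac{c_i}{k}\right)
  = \frac{T}{k} - \frac{S_i}{k^2} = \frac{kT - S_i}{k^2}, \\
\frac{\sigma_t^2 - \sum_i p_i q_i}{\sigma_t^2}
  &= \frac{k S_s - T^2 - kT + S_i}{k S_s - T^2}
  = \frac{k S_s + S_i - T(T + k)}{k S_s - T^2}.
\end{align*}
```

Multiplying the last expression by n/(n-1) gives formula (1), which therefore requires only the totals T, S_s, and S_i rather than the individual item proportions.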

It should be remembered that this procedure is no more
applicable to speed tests than is the Spearman-Brown formula.





MEASUREMENT ABSTRACTS 


Bedell, Ralph. "Scoring Weighted Multiple Keyed Tests on the
IBM Counting Sorter." Psychometrika, V (1940), 195-201.

Tests or personal inventories with differential item response
weights may be scored by means of punch card equipment.
Detailed instructions are given for preparing the cards and scoring
the forms. The scoring speed is approximately four to eight times
that attained by manual scoring. (Courtesy Psychometrika.)


Blakey, Robert. "A Re-Analysis of a Test of the Theory of Two
Factors." Psychometrika, V (1940), 121-36.

The study of William Brown and William Stephenson, "A
Test of the Theory of Two Factors," is re-analyzed by means of
the Thurstone multiple factor methods. No tests or correlations
are left out of the original table of correlations as is done in the
original analysis in an attempt to validate the two-factor theory.
Space, verbal, and perceptual speed factors similar to those found
by Thurstone, Wright, and Garrett are identified. A common
factor of "Maturation" is postulated to account for the remaining
communality of the tests. A fifth factor is considered to
have no significance due to the small amount of variance which
it contributes to the total. (Courtesy Psychometrika.)


Blum, M. L. "A Contribution to Manual Aptitude Measurement
in Industry: the Value of Certain Dexterity Measures for the
Selection of Workers in a Watch Factory." Journal of Applied
Psychology, XXIV (1940), 381-416.

Job analysis of watch assembling suggested the importance of
the ability to make fine finger movements, the ability to handle
tweezers, and the ability to continue to perform delicate tasks
without increasing tension or maladjustment. Three criteria of
proficiency were established: length of employment, salary ratio,
and foremen's ratings. Two hundred and fifty-eight women (37
workers, 137 applicants before being hired, 84 applicants after
being hired) were examined with the O'Connor Finger Dexterity
and Tweezer Dexterity tests. Time scores showed the highest
prediction of the proficiency criteria. The practical value of
critical time scores on the dexterity tests was indicated. W. A.
Varvel.




Cattell, R. B. "A Culture-Free Intelligence Test." Part I. Journal
of Educational Psychology, XXXI (1940), 161-79.

A common source of error in the Binet-Simon type of test
arises from the influence of academic experience and general
cultural background. Instead of sampling the "common knowledge"
of the subject, the test emphasizes the perception of relations
inherent in objects and processes common to a wide range
of cultural groups. One hundred multiple choice scaled items are
chosen from mazes, series, classifications, progressive matrices
(3 types), and mirror images. The progressive matrix test consists
of combined analogy and progressive series items. Harold
Bechtoldt.


Dunlap, Jack W. "Problems Arising from the Use of a Separate
Answer Sheet." Journal of Psychology, X (1940), 3-48.

The use of a separate answer sheet has been considered in
terms of validity and reliability of the more conventional type
of response. Underlining, marking parentheses, and marking
separate answer sheets using serial and repetitive numbering for
choices, with the answer sheets of both articulated and
non-articulated types, lead to practically identical results.
Comparisons were made in terms of means, standard deviations,
reliabilities, and validity of test results for both fourth- and
eighth-grade pupils. The use of an articulated, serial numbered answer
sheet is recommended for tests short enough for all answers to be
recorded on a single side of the sheet. Harold Bechtoldt.


Harrell, Willard. "A Factor Analysis of Mechanical Ability
Tests." Psychometrika, V (1940), 17-33.

The intercorrelations of 37 variables, including the Minnesota
battery of "mechanical ability" tests, the seven MacQuarrie tests
of "mechanical ability," O'Connor's Wiggly blocks, and the
Stenquist picture-matching test, were analyzed by Thurstone's
centroid method. Five factors, Perceptual, Verbal, Youth, Manual
Agility, and Spatial, were taken out. Factors prominent in
so-called mechanical ability tests are the Spatial and Perceptual
ones, with MacQuarrie's dotting test significantly high in the
Manual Agility factor. Each of the factors can be measured with
group pencil-and-paper tests. (Courtesy Psychometrika.)


Harrell, T. W., and Faubion, R. W. "Selection Tests for Aviation
Mechanics." Journal of Consulting Psychology, IV (1940),
104-05.

Students of the United States Army Air Corps Technical
Schools take a basic course of Shop Mathematics, Mechanical
Drafting and Blueprint Reading, Air Corps Fundamentals,
Metal Work, and Electricity besides specializing in some field.
Correlations of 38 tests with these five basic courses range from
-.20 to +.54. Four tests give a multiple correlation of .72 with
a composite basic grade. A factor analysis is being made of 24
of these variables. Harold Bechtoldt.


Johnson, A. P. "A Study of One Company's Criteria for Selecting
College Graduates." Journal of Applied Psychology, XXIV
(1940), 253-64.

A company had for some years considered applications for
sales positions (personnel, advertising, and sales promotion) on
the basis of an intelligence test, a vocabulary test, and ratings on
family background, industriousness, extroversion-introversion,
and flair for writing. The present study examines the data for 80
applicants (41 hired, 39 rejected) and seeks to objectify the
ratings and to establish estimates of their reliability and validity.
The combined ratings of six members of a class in industrial
psychology showed satisfactory reliability. Ratings on "writing
flair" most markedly differentiated the hired from the rejected.
Ratings on "family background" showed the highest correlation
(+0.40 ± 0.12) with service or merit ratings made by five
company executives on 23 workers. W. A. Varvel.


McCloy, C. H. "The Measurement of Speed in Motor Performance."
Psychometrika, V (1940), 173-82.

When the centroid method of factor analysis was applied to
two sets of data on athletic performances, three significant
factors emerged: strength, velocity, and dead weight. Scores on this
speed factor were predicted by the multiple regression technique,
the factor loadings on the speed factor being used as the criterion
correlations, and these predicted scores were correlated with each
of the other variables. When the original tables, augmented by
the new speed variable, were refactored, the computed speed
factor fell on the speed axis as a primary trait. It is thus shown
that it is possible to isolate and measure a factor which appears
in variables under consideration only as a compound. (Courtesy
Psychometrika.)


Palmer, C. E., and Klein, H. "A Table of the Double Integral of
the Gaussian Probability Function." Child Development, XI
(1940), 61-8. F. A. Kingsbury.



Roslow, S., Wulfeck, W. H., and Corby, P. G. "Consumer and
Opinion Research: Experimental Studies on the Form of the
Question." Journal of Applied Psychology, XXIV (1940),
334-46.

Summaries of the results of eight studies on varying the form
of questions are given. Alternate forms of the questionnaire,
successive forms one month apart, and free-response questions
were among the methods used. The use of stereotypes or
emotionally charged words produced significant changes in responses.
Slight changes in wording may or may not result in changes in
frequencies of the response choices. The completeness and
number of alternatives offered in check lists tend to influence the
proportions for any one response, while the results from
free-response questions may be definitely misleading. Harold
Bechtoldt


Sarbin, T. R., and Berdie, R. F. "Relation of Measured Interests
to the Allport-Vernon Study of Values." Journal of Applied
Psychology, XXIV (1940), 287-96.

Fifty-two university students were given the Allport-Vernon
Scale and the Strong Vocational Interest Blank, Form M. A
modification of the pattern analysis described by Darley was
applied to the Strong profiles. Occupational keys were grouped
according to the results of factor analysis studies. "A few of the
occupational groups showing measured interest patterns are
characterized by certain profiles on the Allport-Vernon Scale."
Although there is considerable overlapping between groups, "it is
possible, nevertheless, . . . to use the Allport-Vernon Scale to
approximate certain occupational interest types as measured by
Strong. Thus, a definite but limited use is demonstrated for the
Allport-Vernon scores when it is desirable to distinguish or
identify vocational interest types in the professional, sales, or
'uplift' occupations." W. A. Varvel


Schultz, R. S. "Preliminary Study of an Industrial Revision of
the Revised Minnesota Paper Form Board Test." Journal of
Applied Psychology, XXIV (1940), 463-67.

The Likert-Quasha Revised Minnesota Paper Form Board
Test (Form AA) was further revised for industrial use in order
to decrease the demand on verbal comprehension of the
instructions and to simplify the response required. A preliminary study
of this industrial revision is reported. Correlations with the
Revised Minnesota range from +.71 to +.86. Scores on the
industrial revision tend to be significantly higher. Twenty-one
engineering students obtained a higher average score than did 42
trade-school boys and 57 high-school girls. Correlations with
intelligence correspond with those found in previous studies with
the Paper Form Board Test. W. A. Varvel


Seder, M. "The Vocational Interests of Professional Women."
Part II. Journal of Applied Psychology, XXIV (1940), 265-72.

Sixty women physicians and 69 life insurance saleswomen
filled out both the men's form and the women's form of the
Strong Vocational Interest Blank. For the 268 items common to
the two forms, the median number of discrepant responses was
18 per cent, so that the test-retest reliability is considered
satisfactory. The common items are weighted as heavily as, or more
heavily than, items occurring on only one blank. In general
there is substantial agreement between the weights assigned to
the response to each item by the men's key and by the women's
key for the same occupation. "All indications of this study are
that differences between sexes in an occupation are usually less
frequent and less important than similarities." It is suggested
that a common blank should be composed and that where sex
differences actually appear an occupational key for each sex
should be constructed. W. A. Varvel


Thurstone, L. L. "Experimental Study of Simple Structure."
Psychometrika, V (1940), 153-68.

A battery of 36 tests was given to a group of high-school
seniors. The factorial analysis reveals essentially the same
primary factors that were found in previous studies. The test
battery reveals a simple structure. (Courtesy Psychometrika.)


Tucker, Ledyard R. "The Role of Correlated Factors in Factor
Analysis." Psychometrika, V (1940), 141-52.

The fundamental factor theorem is developed in matrix form
for the case of correlated factors. The properties of the
correlated factor system are discussed, and some effects of sampling
error considered. The psychological meaning of correlated
factors is discussed, and several mechanisms by which general
factors may operate in the factorial system are indicated. (Courtesy
Psychometrika.)


Walker, Helen M. "Degrees of Freedom." Journal of Educational
Psychology, XXXI (1940), 253-69.

The number of degrees of freedom is a basic concept in small
sample theory. Most textbooks omit a discussion of this topic,
and many texts give incorrect formulae and procedures because
they ignore it. The development starts with the freedom of
movement of a point in space under certain restraining conditions and
utilizes the representation of a statistical sample by a single point
in N-dimensional space. Illustrations are presented showing how
to determine the number of degrees of freedom appropriate for
use in certain common situations, such as standard error of the mean,
Chi-square test, contingency tables, partial correlation, and
analysis of variance formulae. Harold Bechtoldt


Young, P. V. "The Validity of Schedules and Questionnaires."
Journal of Educational Sociology, XIV (1940), 22-6.

A brief summary is given of an experiment with a variety of
questionnaires and schedules as used on three considerably
homogeneous communities. Objective, quantitative data were difficult
to obtain. The data shed little light on complexities of social
patterns or on behavior patterns of cultural worlds in relation to
social life and personality adjustment. A review is given of some
problems involved in the construction of such instruments and of
circumstances when they can most advantageously be used.
Calvin Taylor

MEASUREMENT NEWS*

A Personnel Research Section has recently been established
in the War Department under the Adjutant General. The
function of the section is to devise and assemble procedures for the
classification of military personnel. Dr. W. V. Bingham is the
director of this section. Among the professional members of the
staff are Dr. T. W. Harrell, on leave from the University of
Illinois; Mr. W. M. Shanner, on leave from the University of
Chicago; and Dr. Willis Schaefer, formerly of the University of
Chicago. This section has the technical advice of a National
Research Council Committee on the Classification of Military
Personnel. Members of the committee are:

Drs. Walter V. Bingham; Carl C. Brigham, Princeton
University; Henry E. Garrett, Columbia University; L. J. O'Rourke,
United States Civil Service Commission; Marion W. Richardson,
United States Civil Service Commission; Carroll L. Shartle,
Social Security Board; and L. L. Thurstone, University of Chicago.


As a part of the national defense program the Occupational
Analysis Section of the United States Bureau of Employment
Security is making job analyses of occupations in the United
States Army. Over seven thousand analyses will be made and
job specifications will be prepared to aid the Army in making its
assignments of personnel. The Army is also using the Oral Trade
Questions which have been developed by the Employment
Service, as well as the recently published "Dictionary of
Occupational Titles." New aptitude tests developed by the Occupational
Analysis Section are being released to both the Army and the
Navy.

* Notes for this department should be sent to Dr. M. W. Richardson,
United States Civil Service Commission, Washington, D. C.

Another field of activities of the Occupational Analysis
Section is that of assisting local employment offices to select rapid
learners for defense jobs. New aptitude tests are being developed
for this task. New trade tests are also being constructed for
defense jobs requiring highly skilled workers. The greatest
attention is being given to those jobs which are important both to
the armed forces and to the civilian defense industries.

The Occupational Analysis Section is under the supervision
of Dr. C. L. Shartle.

The Washington Psychometric Society was organized
November 13, 1940, with a charter membership of eleven. The following
officers were elected: M. W. Richardson, president; N. J. Van
Steenberg, secretary; C. R. Brolyer, treasurer. It is planned to
hold meetings once a month.

Machine methods as applied to the field of measurement
formed the major subject of discussion at an "Educational
Research Forum" held at the Homestead of the International
Business Machines Corporation at Endicott, New York, during the
week of August 26 to 31. A limited number of transcripts of the
proceedings are available to those interested. Requests for
transcripts should be sent to Mr. E. C. Schroedel, Manager,
Institutional Department, International Business Machines Corporation,
590 Madison Avenue, New York, New York.

The papers presented at the Forum are listed below.

Computation of Statistical Constants

"The Value of the Collator in Using Prepunched Cards for
Obtaining Moments and Product Moments." -- Alan D. Meacham

"The Computation of Means, Standard Deviations and
Correlations by Use of the Tabulator When the Numbers Are Either
Positive or Negative." -- Jack W. Dunlap

"Summary of Problems in Computation of Statistical
Constants." -- Paul S. Dwyer


"The Design of Tabulating Procedures in Relation to
Automatic Error Control in Statistical Analysis." -- Charles R.
Langmuir

"Code Numbers and Coding as Aids to Research." -- Herbert A.
Toops

Classification and Prediction

"Four Aspects of Factor Analysis: A Problem for Which
Machine Procedures Are Needed." -- Harry H. Harman

"Use of Tabulating and Scoring Machines in Factor
Analysis." -- Ledyard R. Tucker

"Canonicals." -- Irving Lorge

"A Successive Approximation Solution for Prediction
Problems Involving a Large Number of Variables." -- John C.
Flanagan

"Problems of Classification of Personnel in the Army." --
Truman L. Kelley

"Army Testing Problems." -- T. W. Harrell

Test Construction

"Computing Difficulty Index and Validity Index in Item
Analysis by IBM Machines." -- John M. Stalnaker

"Item Analysis by Test Scoring Machine Graphic Item
Counter." -- John C. Flanagan

"Repetitive Scoring of Interest and Personality Tests in
Developing Item Weights by an Iterative Process." -- Robert T.
Rock, Jr.

Testing Programs

"The Integration of the Test Scoring Machine with
Tabulating Equipment in a System of Progress Tests and
Comprehensive Examinations." -- J. V. McQuitty

"Applications of Electric Accounting Machines in Reporting
Individual and Group Results in a Testing Program." --
Charles R. Langmuir

"The Facilitation of the Analysis and Distribution of College
Entrance Test Data in a Statewide Testing Program." -- E. L.
Stromberg


In the spring of 1939, at the request of a number of school
teachers and administrators throughout the United States, the
American Council on Education appointed the National
Committee on Teacher Examinations, and authorized it to supervise and
delegate to the Cooperative Test Service of the American Council
the task of preparing a battery of objective tests for the
examination of teaching candidates. The National Teacher Examinations
were administered for the first time in various centers throughout
the United States on March 29-30, 1940.

New editions of the Teacher Examinations are being prepared
for administration in 1941. The tests cover such areas as
understanding and use of the English language; reasoning ability;
knowledge of contemporary affairs; general cultural information;
understanding of professional educational points of view, goals,
attitudes, and methods; and mastery of subject matter to be
taught. All examinations are objective, consisting of
short-answer items involving multiple-choice response. In 1941 the
National Teacher Examinations will be administered over two full
days. Approximately twelve hours of testing time are required
for the examinations.

The dates which have been named by the National Committee
for the administration of the Teacher Examinations in 1941 are
March 14 and 15.

The examinations have, of necessity, been limited to
intellectual, academic, and cultural materials. Other important factors
that determine teaching success, such as training, experience,
personality characteristics, social adaptability, and others are
judged independently by the local authority to whom the
candidate applies.


A revised series of Cooperative General Achievement Tests
was introduced this fall by the Cooperative Test Service. The
revised series, beginning with Form QR, includes Test I: A Test
of General Proficiency in the Field of Social Studies; Test II:
A Test of General Proficiency in the Field of Natural Sciences;
and Test III: A Test of General Proficiency in the Field of
Mathematics. These general proficiency tests are not composed
of questions dealing with topical content of the fields covered.
Instead, each test is divided into two parts: the first testing for
knowledge of the terms and concepts essential to an
understanding of the area in question, the second testing the student's
ability to comprehend and interpret typical materials in the fields.


PRIMARY MENTAL ABILITIES OF CHILDREN¹


THELMA G. THURSTONE

Chicago Teachers College

FOR MANY years psychologists have been accustomed to
the problems of special abilities and disabilities. These
are, in fact, the principal concern of the school psychologists
who deal with children who cannot read, have a blind spot for
numbers, or do one thing remarkably well and other things
poorly. It seems strange with all this experience in differential
psychology that we have clung so long to the practice of
summarizing a child's mental endowment by a single index,
such as the mental age, the intelligence quotient, the percentile
rank in general intelligence, and other single average
measures. An average index of mental endowment should be
useful for many educational purposes, but it should not be
regarded as more than the average of several tests. Two
children with the same mental age can be entirely different
persons, as is well known. There is nothing wrong about using
a mental age or an intelligence quotient if it is understood
as an average of several tests. The error that is frequently
made is interpreting it as measuring some basic functional
unity when it is known to be nothing more than a composite
of many functional unities.

The researches on the primary mental abilities which have
been in progress for several years have had as their first
purpose the identification and definition of the independent
factors of mind. As the nature of the abilities became more
clearly indicated by successive studies, a second purpose of a
more practical nature has been involved in some of the studies.
This purpose has been to prepare a set of tests of
psychological significance and practicable adaptability to the school
testing and guidance program. The series of studies will be
summarized in this paper, the battery of tests soon to be
available will be described, and some of the problems now
being investigated will be discussed briefly.

¹ The studies reported in this paper have been carried out under the joint
sponsorship of the Chicago Public Schools, the University of Chicago, and the
American Council on Education.

Previous Studies 

The first study in this series involved the use of 56
psychological examinations that were given to a group of about
250 college students. That study revealed a number of
primary abilities, some of which were clearly defined by the
configuration of test vectors while others were indicated by
the configuration but less clearly defined. All of these factors
have been studied in subsequent test batteries in which each
primary factor has been represented by new tests specially
designed to feature the primary factors in the purest possible
form. The object has been to construct tests in which there
is a heavy saturation of a primary factor and in which other
factors are minimized. This is the purification of tests by
reducing their complexity.

These latter studies of the separate abilities were in each
case made in the Chicago high schools: one study emphasizing
the perceptual factor at the Lane Technical High School,
one study of the inductive factor at the Hyde Park High
School, an intensive study of the memory factor or factors
in four high schools, and a study of numerical ability by
Coombs in six high schools. In each series of tests, one factor
was represented by a large number of tests, but all factors
were well represented. In all of these studies the same
primary abilities were identified as had been found in the
experiment with college students. These studies led to the
publication by the American Council on Education of an experimental
battery of tests for the primary mental abilities, adaptable
for use with students of high school or college age.


The identification of the same primary mental abilities
among high school students as we had previously found among
college students encouraged us to look for differentiation
among the abilities of younger children. In the Chicago
Public Schools, group mental tests are made of all 1B, 4B, and
8B children in the elementary schools and of 10B students
in the high schools. The demand for a series of tests to be
used in the guidance program for high school entrants and
the advisability of not making too broad a leap in age led
us to select an eighth-grade population for the next study.

The Eighth-Grade Experiment

In view of the purpose of investigating whether or not
primary mental abilities could be isolated for children at the
fourteen-year age level, the construction of the tests consisted
essentially in the adaptation for the younger children of tests
previously used with high school students. In some of the
tests little or no alteration was necessary, while for other
tests it was considered advisable to revise vocabulary and
other aspects of the tests to suit the younger age level. A
number of new tests were added to those selected from
previous experimental batteries. Sixty tests constituted the final
battery.

When the tests had been designed and printed, they were
given in a trial form to children in grades 7A and 8A in
several schools. Groups of from 50 to 100 children in these
two grades were used for the purpose of standardizing
procedures and, especially, for setting time limits.

Fifteen Chicago elementary schools were selected by Miss
Minnie L. Fallon, Assistant Superintendent in charge of
elementary education, and by Dr. Grace E. Munson, Director
of the Bureau of Child Study, as experimental schools for this
study. The tests in the main investigation were administered
in the schools by the adjustment teachers. These adjustment
teachers had had special training in testing procedures with
the Bureau of Child Study and also had had considerable
experience in giving psychological and educational tests.
Special instructions in the procedures for these tests were given
to the adjustment teachers, as well as written instructions for
each day's testing program.

Eleven hundred and fifty-four children participated in this
study. The complete battery of 60 tests was given in 11
one-hour sessions to the children in the 8B grades in each of the
15 schools. The children enjoyed the tests and, with very few
exceptions, the sustained interest and effort were quite evident.
One thing which a psychologist might fear in such a long
series of tests would be fluctuating motivation on the part of
the students. Although the adjustment teachers administered
the tests, every session was observed by a member of our
staff, and we were highly gratified by the sustained interest and
effort of the pupils.

In addition to the 60 tests we used three more variables:
chronological age, mental age, and sex. The latter test data
were available in school records. They were determined by
the Kuhlmann-Anderson tests which had been given previously
to the same children. Therefore, the battery to be analyzed
factorially contained 63 variables.

The total population in this study consisted of 1,154
eighth-grade children. When all the records had been
assembled, it was found that 710 of these subjects had complete
records for all of the 63 variables. We decided to base our
correlations on this population of complete records rather
than to use the larger population with varying numbers of cases
for the correlation coefficients. For convenience of handling
with the tabulating-machine methods, the raw scores were
transmuted into single-digit scores from which the Pearson
product-moment correlation coefficients were computed. With
63 variables there were 1,953 Pearson correlation coefficients.
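The arithmetic of this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how raw scores were transmuted into single digits, so the equal-width banding rule in `to_single_digit`, the raw score ranges, and the five sample scores are all hypothetical. The count of coefficients follows from the number of distinct pairs among 63 variables, 63 x 62 / 2.

```python
from math import comb, sqrt


def to_single_digit(raw, lowest, highest):
    """Map a raw score onto a single digit 0-9 using ten equal-width bands.

    The equal-width rule is an assumption made for illustration only.
    """
    if highest <= lowest:
        raise ValueError("highest must exceed lowest")
    band = int(10 * (raw - lowest) / (highest - lowest))
    return max(0, min(band, 9))  # the top raw score is placed in band 9


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Number of distinct pairs among 63 variables: 63 * 62 / 2.
n_pairs = comb(63, 2)

# Hypothetical raw scores of five children on two tests.
test_a = [12, 30, 45, 51, 78]
test_b = [8, 22, 25, 40, 60]
digits_a = [to_single_digit(s, 0, 80) for s in test_a]
digits_b = [to_single_digit(s, 0, 60) for s in test_b]
r_digits = pearson_r(digits_a, digits_b)
print(n_pairs, digits_a, digits_b, round(r_digits, 3))
```

Coarsening scores to ten bands loses some precision, which is the price paid for fitting each variable into one card column on the tabulating machines.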

This table of intercorrelations was factored to 10 factors
by the centroid method on the tabulating machines by means
of punched cards. Successive rotations made by the method
of extended vectors yielded an oblique factorial matrix which
is a simple structure.
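For readers unfamiliar with the centroid method, the extraction of the first factor can be sketched as follows. This is a simplified illustration and not the procedure as run on the punched-card equipment: the four-test correlation table is hypothetical, the diagonal cells hold assumed communality estimates (the largest off-diagonal entry in each column), and the sign reflections needed before extracting later factors are omitted.

```python
from math import sqrt


def first_centroid_factor(corr):
    """Return (loadings, residual) for the first centroid factor.

    corr is a square symmetric table (list of lists) whose diagonal cells
    hold communality estimates. Each loading is the column sum divided by
    the square root of the sum of all entries in the table.
    """
    n = len(corr)
    column_sums = [sum(corr[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)
    if total <= 0:
        raise ValueError("table total must be positive")
    loadings = [c / sqrt(total) for c in column_sums]
    # Residual table: what remains after removing the first factor.
    residual = [
        [corr[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical correlations among four tests, communality estimates on
# the diagonal.
corr = [
    [0.60, 0.60, 0.50, 0.40],
    [0.60, 0.60, 0.45, 0.35],
    [0.50, 0.45, 0.50, 0.30],
    [0.40, 0.35, 0.30, 0.40],
]
loadings, residual = first_centroid_factor(corr)
print([round(a, 3) for a in loadings])
```

Every column of the residual table sums to zero, which is why the full method reflects the signs of some tests before the next factor is taken out.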

Inspection of the rotated factorial matrix showed seven
of the factors previously indicated: Memory, Induction,
Verbal Comprehension, Word Fluency, Number, Space, and
Perceptual Speed; and three less easily identifiable factors. One
of these is another Verbal factor, one is involved in ability
to solve pencil mazes, and one is present in the three
dot-counting tests which were used.

We have computed the intercorrelations between the 10
primary factors. Our main interest centers on the seven
primary factors that can be given interpretation and,
especially, on the first six of these factors for which the
interpretation is rather more definite. Among the high correlations
we note that the Number factor is correlated with the two
Verbal factors. The Word Fluency factor has high correlation
with the Verbal Comprehension factor and with Induction.
The Rote Memory factor seems to be independent of the other
factors. These correlations are higher than the correlations
between primary factors for adults.

Because of the psychological interest in the correlations
of the primary mental abilities, we have made a separate
analysis of the correlations for those factors which seem to
have reasonably certain interpretation. If these six primary
mental abilities are correlated because of some general
intellective factor, then the rank of the correlation matrix should
be one. Upon examination, this actually proves to be the case.
A single factor accounts for most of the correlations between
the primary factors.
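The rank-one condition can be illustrated numerically. The sketch below is not the authors' computation, and the six loadings are hypothetical values chosen only so that Reasoning is highest and Rote Memory lowest. If one general factor accounts for the correlations, each off-diagonal correlation is the product of two loadings, every tetrad difference vanishes, and each loading can be recovered from triads of correlations as sqrt(r_ij r_ik / r_jk).

```python
from itertools import combinations
from math import sqrt


def one_factor_correlations(loadings):
    """Correlation table implied by a single common factor."""
    n = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


def largest_tetrad_difference(corr):
    """Largest absolute tetrad difference over all sets of four variables.

    For variables i, j, k, l the three products r_ij*r_kl, r_ik*r_jl and
    r_il*r_jk are all equal when the off-diagonal table has rank one.
    """
    worst = 0.0
    for i, j, k, l in combinations(range(len(corr)), 4):
        products = (
            corr[i][j] * corr[k][l],
            corr[i][k] * corr[j][l],
            corr[i][l] * corr[j][k],
        )
        worst = max(worst, max(products) - min(products))
    return worst


def general_loading(corr, i):
    """Mean triad estimate sqrt(r_ij * r_ik / r_jk) of variable i's loading."""
    others = [v for v in range(len(corr)) if v != i]
    estimates = [
        sqrt(corr[i][j] * corr[i][k] / corr[j][k])
        for j, k in combinations(others, 2)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical loadings of the six primaries on a general factor.
names = ["R", "V", "W", "N", "S", "M"]
hypothetical_g = [0.8, 0.7, 0.6, 0.6, 0.5, 0.3]
corr = one_factor_correlations(hypothetical_g)
recovered = [general_loading(corr, i) for i in range(len(corr))]
print(largest_tetrad_difference(corr))
print(dict(zip(names, (round(g, 3) for g in recovered))))
```

With observed correlations the tetrad differences would be small rather than exactly zero, and the judgment that "a single factor accounts for most of the correlations" rests on how small they are relative to sampling error.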

The single factor loadings show that the inductive factor
has the highest loading and the Rote Memory factor the lowest
loading on the common general factor in the primary abilities.
This general factor is what we have called a second-order
general factor. It makes its appearance not as a separate
factor, but as a factor inherent in the primaries and their
correlations. If further studies of the primary mental abilities of
children should reveal this general factor, it may sustain
Spearman's contention that there exists a general intellective
factor. Instead of depending on the averages or centroids
of arbitrary test batteries for its determination, the present
method should enable us to identify it uniquely.

We have not been able to find in these data a general
factor that is distinct from the primary factors, but the
second-order general factor should be of as much psychological
interest as the more frequently postulated, independent
general factor of Spearman. It would be our judgment that
the second-order general factor found here is probably the
general factor which Spearman has so long defended, but we
cannot say whether he would accept the present findings as
sustaining his contentions about the general factor. We have
not found any occasion to debate the existence of a general
intellective factor. The factorial methods we have been using
are adequate for finding such a factor, either as a factor
independent of the primaries or as a factor operating through
correlated primaries. We have reported on primary mental
abilities in adults, which seem to show only low positive
correlations except for the two verbal factors. In the present
study we have found higher correlations among the primary
factors for eighth-grade children. It is now an interesting
question to determine whether the correlations among primary
abilities of still younger children will reveal, perhaps even
more strongly, a second-order general factor.

Interpretation of Factors

The analysis of this battery of 60 tests revealed essentially
the same set of primary factors which had been found in
previous factorial studies. Six of the factors seemed to have
sufficient stability for the several age levels that have been
investigated to justify an extension of the tests for these
factors into practical test work in the schools. In making this
extension we have been obliged to consider carefully the
difference between research on the nature of the primary
factors and the construction of tests for practical use. Several
of the primary factors are not yet sufficiently clear as regards
psychological interpretation to justify an attempt to appraise
them generally among school children. The primary factors
that do seem to be clear enough for such purposes are the
following: Verbal Comprehension V, Word Fluency W,
Number N, Space S, Rote Memory M, and Induction or
Reasoning R. The factors which in several studies are not yet
sufficiently clear for general application are the Perceptual
factor P and the Deductive factor D.

The Verbal factor V is found in tests involving verbal
comprehension, for example, tests of vocabulary, opposites
and synonyms, completion tests, and various reading
comprehension tests.

The Word Fluency factor W is involved whenever the
subject is asked to think of isolated words at a rapid rate.
It is for this reason that we have called the factor a Word
Fluency factor. It can be expected in such tests as anagrams,
rhyming, and producing words with a given initial letter,
prefix, or suffix.

The Space factor S is involved in any task in which the
subject manipulates an object imaginally in two or three
dimensions. The ability is involved in many mechanical tasks and
in the understanding of mechanical drawings. Such material
cannot be used conveniently in testing situations, so we have
used a large number of tasks which are psychologically similar,
such as Flags, Cards, and Figures.

The Number factor N is involved in the ability to do
numerical calculations rapidly and accurately. It is not
dependent upon the reasoning factors in problem-solving, but seems
to be restricted to the simpler processes, such as addition and
multiplication.

A Memory factor M has been clearly present in all test
batteries. The tests for memory which are now being used
depend upon the ability to memorize quickly. It is quite
possible that the Memory factor will be broken down into
more specific factors.

The Reasoning factor R is involved in tasks that require
the subject to discover a rule or principle covering the
material of the test. The Letter Series and Letter Grouping
tests are good examples of the task. In all these experimental
studies two separate Reasoning factors have been indicated.
They are perhaps Induction and Deduction, but we have not
succeeded in constructing pure tests of either factor. The tests
which we are now using are more heavily saturated with the
Inductive factor, but for the present we are simply calling
the ability R, Reasoning.

In presenting for general use a differential psychological
examination which appraises the mental endowment of
children, it should not be assumed that there is anything final
about six primary factors. No one knows how many primary
mental abilities there may be. It is hoped that future factorial
studies will reveal many other important primary abilities so
that the mental profiles of students may eventually be
adequate for appraising educational and vocational potentialities.
In such a program the present studies are only a starting
point in substituting for the description of mental endowment
by a single intelligence index the description of mental
endowment by a profile of fundamental traits.

The Final Test Battery

In adapting the tests for practical use in the schools for
the appraisal of six primary mental abilities, we must
recognize that the new test program has for its object the
production of a profile for each child, as distinguished from the
description of a child's mental endowment in terms of a single
intelligence index. For many educational purposes it is still
of value to appraise a child's mental endowment roughly by
a single measure, but the composite nature of such single
indices must be recognized.




The factorial matrix of the battery of sixty tests was
inspected to find the three best tests for each of seven primary
factors. In making the selection of tests for each primary
factor we considered not only the factorial saturations of the
tests, which are, of course, the most important consideration,
but also the availability of parallel forms which may be needed
in case the tests should come into general use. Ease of
administration and ease in understanding of the instructions are
also important considerations.

The three tests for each primary factor were printed in a
separate booklet and the material was so arranged that the
three tests for any factor could be given easily within a
40-minute school period. The main purpose of the larger test
battery was to determine whether or not the primary factors
could be found for eighth-grade children, but the purpose of
the present shorter battery was to produce a practical, useful
test battery and to check its factorial composition. The
selected tests were edited and revised so that they could be used
for either hand-scoring or machine-scoring. The Word
Fluency tests constitute an exception in that none of the tests
now known to be saturated with this factor seems to be
suitable for machine-scoring.

In order to check the factorial analysis at the present age
level, we arranged to give the selected list of 21 tests to a
second population of eighth-grade children. The resulting
data were factored independently of the larger battery of
tests. There were 437 subjects in this population who took
all of the 21 tests. This population was used for a new factor
analysis. The results of this analysis clearly confirmed the
previous study. The simple structure in the present battery
is sharp, with only one primary factor conspicuously present
in each test, so that the structure could be determined by
inspection for clusters.

A battery of 17 tests has been assembled into a series of
test booklets for use in the Chicago schools. An experimental
edition of 25,000 copies has been printed, and the plan for
securing norms on these tests includes their administration to
1,000 children at each half-year grade level from grade 5B
through the senior year in high school. These records have
been obtained during the school year 1940 to 1941. The use
of such a wide age range in standardizing the test is at first
thought, perhaps, rather strange. The effort was made in
order to secure age norms throughout the entire range of
abilities found among eighth-grade children, since the tests
are to become a part of the testing procedure for all 8B
children in the Chicago schools. Separate age norms will be
derived for each of the six primary abilities. If a single index
of a student's mental ability is desired, it is recommended
that the average of his six ability scores be used.
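
The recommendation to average the six ability scores can be illustrated with a short sketch. It assumes that the six scores have already been expressed on a common scale (for example, as age-normed scores); the function name `single_index` and the example profile are hypothetical and are not taken from the test manual.

```python
from statistics import mean

# The six primary abilities named in the text.
ABILITIES = ("V", "S", "N", "M", "W", "R")


def single_index(profile):
    """Return the unweighted mean of the six ability scores.

    `profile` maps each ability letter to a score that is assumed
    to be on a common (e.g. age-normed) scale.
    """
    missing = [a for a in ABILITIES if a not in profile]
    if missing:
        raise ValueError(f"profile lacks scores for: {missing}")
    return mean(profile[a] for a in ABILITIES)


# Hypothetical profile, for illustration only.
example_profile = {"V": 52, "S": 47, "N": 55, "M": 49, "W": 50, "R": 53}
example_index = single_index(example_profile)
```

The profile itself remains the primary record; the average is only a convenience when one number is required.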

As soon as the norms are established, the tests will be
published by the American Council on Education under the
title "Chicago Primary Mental Abilities Tests." It is expected
that the tests will be ready for distribution during the summer
of 1941. The norms provided with the tests will be of
a wide enough range to make the tests useful at the high
school and upper grade levels.

The complete test program consists of 17 tests, all of
which have been reduced to machine-scoring form except the
three tests for the Word Fluency factor W. In the nature
of the case there seem to be difficulties in reducing this test to
machine-scoring form, and hence it has been retained in
hand-scoring form. It should be said, however, that the W tests
can be scored almost as fast, if not as fast, as the tests which
are machine-scored. Since all of the tests can be hand-scored,
their use is not limited to schools large enough to avail
themselves of the scoring machine. The hand-scoring of all
the tests is very easily accomplished by the use of perforated
stencils to be provided with the tests. Hand-scoring is
facilitated by the use of the scoring board distributed by the
Stoelting Company.

The new battery represents six primary mental abilities,
namely, Verbal Comprehension V, Space S, Number N, Memory M,
Word Fluency W, and Reasoning R. They enable the
skilled psychologist to tabulate a profile of six linearly
independent scores instead of a single measure, such as the
intelligence quotient.

Principals, teachers, adjustment teachers, and school
psychologists have expressed their satisfaction with the profile
of abilities plotted for each child. Probably the children
themselves have found the profiles most interesting and have
profited most from an examination of their own profiles. In
the school year 1941-1942, these tests will be installed as a
part of the educational guidance program in the Chicago
schools by administering them regularly to 8B elementary
school pupils and 10B high school pupils.

Some of the features of the tests should be mentioned.
The tests are so arranged that machine-scoring and hand-scoring
tests are directly comparable and will have the same
norms. The child's task does not vary with the type of scoring;
only the scorer's job is changed. Another feature is the
use of fore-exercise booklets printed on yellow paper. The
time limits for the practice exercises are approximate. When
a test proper is started, the student places his white test booklet
on top of his yellow practice booklet, and the examiners
and proctors can check at a glance that every child is working
in the right place. The tests proper are to be timed
exactly. The three tests of each of the six abilities are
arranged in a booklet for administration within a 40-minute
school period. It is recommended that the successive booklets
be given on successive school days.

Further Problems

One of our principal research interests at the present
time is to determine whether primary abilities can be identified
in children of kindergarten or first-grade age. A series
of about 50 tests is well under way, and some of them are
now being tried with young children. If we succeed in isolating
primary abilities among these young children, our next
step will be to prepare a practical battery of tests for that
age. A subsequent problem will be to make experimental
studies of paper-and-pencil tests for appraising the primary
abilities of children in the intermediate grades, approximately
at the fourth-grade level. We are fairly confident that such
tests can be prepared for use in the intermediate grades.

It is a long way in the future, but it is interesting to speculate
on the possibility of using the tests of the primary
mental abilities as the tool with which to study fundamental
psychological problems of mental growth and mental inheritance.
Absolute scaling of the tests at the different age levels
will make possible studies on the rates of development of the
separate abilities at various age levels. Modifiability of the
abilities will be another problem to which we shall later
turn attention.


A STATISTICAL EVALUATION OF 
CLINICAL COUNSELING¹

E. G. WILLIAMSON AND E. S. BORDIN
University of Minnesota 

SYSTEMATIC efforts at evaluation are a relatively recent
development in the field of counseling. The form of
appraisal has ranged from "verbal research" to simple statistical
analysis. Not all of these studies have avoided the pitfalls,
of which there are many, to be found in this undertaking.
The assumptions, methods, and weaknesses involved in
the various evaluation approaches are summarized in a previous
paper (10).

The present paper, one of a number to be reported, summarizes
an experiment designed to evaluate a certain type of
counseling. Since our conclusions are applicable only to counseling
based upon the philosophy and procedures employed
at the Testing Bureau of the University of Minnesota, this
type of clinical counseling should be described.

This clinical counseling has as its purpose assisting the
student to choose and make progress toward educational and
vocational objectives which will yield maximum satisfaction.
It is assumed that this end can be accomplished best by aiding
him to set his aspirations in terms of the level of his
potentialities. Naturally his potentialities must first be analyzed
before a diagnosis of any discrepancy between aspiration and
ability can be made and before assistance can be forthcoming
from the counselor.

¹Assistance in the preparation of this material was furnished by the
personnel of Work Projects Administration, Official Project No. 65-1-71-HO,
Sub-Project No. 93.


The case data upon which analyses and diagnoses are made
consist of standardized ability tests, personality and interest
inventories, questionnaire records, and non-quantified information
collected from the student, his associates, and his parents.
This accumulation of information must be interpreted on the
basis of an integrated picture of the individual provided
through personal interviews. In other words, the counselor
deals with a unique individual rather than with a generalized
conception of a group of individuals.

The interview provides the medium through which counseling
is personalized and through which the student is assisted
in making his decisions. While the decisions that the student
accepts are and should be his own, the counselor sometimes
plays a persuasive role in that he organizes relevant case data
to highlight the alternative courses of action from which the
student chooses. Once the student has made the choice, the
counselor has the task of aiding him to orient himself to his
interests, attitudes, and abilities, and his environment, home
and family, recreation, and education for the most successful
achievement of the chosen objective.

A fuller description of this clinical counseling process has
been presented elsewhere (9, 11). Only by means of an
accurate conception of what the counselor is doing and what
he is trying to do can any evaluation of that counseling be
meaningful. Moreover, we should not attempt to generalize
our conclusions to include any other type of counseling than
the one studied.

In attempting to evaluate this clinical counseling, we believe
that a criterion flexible enough to avoid artificial fragmentation
of the individual provides the most adequate design
for experimentation. Essentially such a design involves a
judgmental comparison of the individual's adjustment status
before and after counseling. This method, essentially the
non-statistical weighting of variables to form a composite
estimate, was used in a previously reported study (11, chap.
IX). In this study an estimate of the degree of the student's
cooperation was used as a means of control. The control
lies in the comparison of those students who did with those
who did not follow the counselor's recommendations.

The process of making these evaluative judgments involves
three phases: (a) the preliminary review or analysis of
the case data, (b) the follow-up interview, (c) the case
evaluation.

In the first phase of the experiment, all student cases were
independently and critically read by two trained workers whose
functions were to analyze and record all information contained
in the case folder. Any discrepancies between the analyses
of the two readers were reconciled or adjusted in conference
with the staff members concerned with the project.
The case reviewers also compiled questions concerning the
present status of the student, his adjustment to his original
problem or problems, his adjustment to the counsel given, and
any other pertinent information. These questions were used
subsequently in the follow-up interview.

For the second step all student cases were called in for
a follow-up interview. Cases which were incomplete because
of insufficient interview contacts or incomplete test battery
and which could not be reached for follow-up interviewing
on the campus were reached either through a questionnaire
or an interview in the home. For those students who had
left the University, information was collected, and used,
concerning their adjustment to their jobs and their satisfaction
with that out-of-school adjustment. The follow-up interview
yielded information concerning the extent of success or failure
achieved by the student in solving each of his original problems
and the extent to which the counselor's advice had been
followed subsequent to the original counseling interviews. The
student's own statement of the degree of his satisfaction with
his solution of the problems and with University Testing Bureau
counseling and recommendations, or any other interpretations
or evaluations that he made, were specifically recorded.
The interviewers did not interpret or evaluate this information.
The purpose of the follow-up interview was, essentially,
to obtain the factual data on the present status of the case.

Trained evaluators next critically reviewed the original
case data and the follow-up interview report to arrive at a
judgment of the extent to which the student had adjusted the
problems for which he had originally sought counseling. The
effectiveness of the counseling was evaluated in terms of the
following counseling functions or services:

1. Diagnosis of the student's vocational and educational
possibilities.

2. Advice in making appropriate choice of a vocational
field and in securing the related educational training sequences.

3. Counseling as to recognition of, and alleviation of,
disturbing factors (emotional, educational, economic, health)
which may interfere with or prevent the acceptance of proper
vocational choice and the achievement of appropriate training.

4. Assistance in the discovery and utilization of personal
resources in effecting an adjustment.

5. Guidance in the use of all University personnel
resources in diagnosis and counseling.

The student's adjustment with regard to vocational choice
and his progress toward achieving satisfactory training for
that choice were judged by means of the following criteria:

1. Choices made in line with aptitudes, interests, work
habits, personality, etc.

2. Program of studies in line with these choices and the
student's qualifications.

3. The student's satisfaction with vocational choice.

4. Progress in achieving training for objectives in terms
of the capacity of the student to profit from such training.

5. Alleviation of factors which interfered with the making
of a satisfactory vocational choice and with acquiring
the necessary training, e.g., parental dominance of choice,
inadequate study skills, etc.

In making their appraisals the evaluators studied the
student's interests and aptitudes, the counselor's interview
notes, the student's reported comments, and his grade record
achieved before and after counseling. All of the information
was weighed and balanced with reference to the five criteria
before a judgment was made of the degree of adjustment
achieved by the student subsequent to counseling. The degree
of cooperation was independently judged in the same manner.

The following five categories² formed the scale of
adjustment:

Satisfactory Adjustment: 1. The student is satisfied with
his vocational adjustment at the time of the follow-up
interview. In some cases the student's dissatisfaction will not be
a deterrent to a rating of satisfactory adjustment. In instances
where the student's aspirations are far above his level of
ability, he is considered satisfactorily adjusted if he accepts the
fact that his ambitions must be pitched at a lower level.

2. In the interviewer's judgment the vocational choice
and adjustment of the student are adequate, based upon
aptitudes, interests, and subjective factors revealed through
interviews.

3. There has been an alleviation of distracting factors
which interfere with vocational choice and professional
training, such as inadequate socialization, mental conflicts,
financial problems, health handicaps, and any other problems.

4. Achievement in a given training program is commensurate
with aptitudes and interests.

Some Progress Toward Adjustment: The student has
not yet reached a satisfactory adjustment, according to the
previously stated criteria, but is evidently started on the road
and may eventually reach the desired objective. He may have
come to the counselor with a number of problems involving
vocational choice, classification in college classes, and social
adjustment and personal peculiarities. In the follow-up
interview it may be found, for example, that he has succeeded in
settling his vocational question but that he is still struggling
for mastery in regard to social adjustments.

No Change: This classification is used for those cases in
which the problems remain the same as at the time of
counseling. While the passage of time will usually make a problem
more serious, the designation of "slightly worse" was not
applied unless a choice point had actually been passed. Thus
a sophomore who has not yet made a vocational choice would
be classed as unchanged, but a junior without a vocational
choice would be in a more serious position and therefore would
be classified as slightly worse. Juniors should have begun
specialization if they are to make "normal" progress toward
graduation.

²Described and illustrated in an earlier study. See reference 11, chap. IX.

Slightly Worse: This is a condition in which the solution
of the original problems seems slightly more remote and the
factors which existed at the time of the first counseling contact
still exist and are accentuated.

Much Worse: Those cases where the student's problems
are more severe and the solution much more remote or less
probable of achievement.

The judgments of the degree of cooperation were based
upon the following categories:

Followed advice wholly: The student followed the
counselor's advice with respect to the most dominant or important
original problems.

Followed advice in part: The student either partially
followed the counselor's advice with respect to the chief problems
or completely followed advice with respect to some but did
not follow advice with respect to others.

Did not follow advice: The student did not follow the
counselor's advice in regard to any of the main problems.

In order to determine the reliability of classification of cases
according to the foregoing two sets of categories, an "outside"
judge was called in to make independent judgments of the
adjustment of a random sample of 247 cases. A coefficient
of correlation of .82 was found between the "outside" judge's
classifications and those made by the evaluators. This coefficient
may be interpreted as a high index of validity or as a fair
index of reliability, according to the reader's own conception
of the meaning of these two terms. In over half of the cases
where a discrepancy occurred, the "outside" judge had made
a higher classification than had the two original judges. This
would seem to indicate that the evaluators had not overestimated
the effectiveness of the counseling.

The question arises as to how much influence the student's
subsequent academic achievement (available to the evaluators)
had on the judge's estimate of adjustment and cooperation.
The correlations between honor-point ratio achieved
after counseling and judgment of adjustment were .23 and
.39, respectively, for General College and the College of Science,
Literature and the Arts.³ The difference between these
coefficients, of borderline significance
($D_r / \mathrm{S.E.}_{D_r} = 2.0$),⁴
may indicate that academic adjustment is more closely related
to judgment of total adjustment for SLA students than for
General College students. This does not seem unreasonable,
since SLA students are generally committed to careers in which
academic achievement is one of the most immediate requisites
for success. The correlations of honor-point ratio with judgment
of cooperation were of negligible magnitude: .16 ± .05
for General College students and .17 ± .03 for SLA students.
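
The comparison of the two correlations (.23 and .39) can be sketched with Fisher's z transformation. This is an illustrative reconstruction, not necessarily the exact computation used for the paper: it assumes the coefficients were computed on the full groups of 195 General College and 498 SLA students, and the function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)


def correlation_difference_ratio(r1, n1, r2, n2):
    """Ratio of the difference between two independent correlations
    (on the z scale) to the standard error of that difference."""
    z_diff = fisher_z(r2) - fisher_z(r1)
    # Standard error of a z value from a sample of size n is 1/sqrt(n - 3).
    se_diff = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z_diff / se_diff


# Group sizes are an assumption (taken from the sample description).
ratio = correlation_difference_ratio(0.23, 195, 0.39, 498)
```

Under these assumptions the ratio comes out a little above 2, which agrees with the borderline significance described in the text.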

In all, data were collected on 987 complete student cases
who used University Testing Bureau services during the years
1933-34-35. For the purposes of this study it was deemed
desirable to analyze as homogeneous a population as possible
without the sacrifice of too much data. For this reason 498
students from SLA and 195 students from the General College
were selected. Classified according to their status at the time
of counseling, in the SLA group were 154 pre-college cases,
176 freshmen, and 168 sophomores. The General College
group contained 41 pre-college cases, 125 freshmen, and 29
sophomores. The pre-college cases were high school seniors
or recent graduates who came to the Bureau for counseling
in the spring and summer immediately preceding enrollment
in the University.

That the groups chosen for study were a fairly satisfactory
representation of the total range of ability and achievement
in the undergraduate classes of these colleges can be demonstrated
from the distributions of aptitude test and high school
percentile scores. Of the SLA students, 62.6 per cent and 78.1
per cent fell at or above the fiftieth percentile in aptitude and
high school percentile scores, respectively, while 82.8 per cent
and 65.2 per cent of the General College students fell at or
below the fiftieth percentile in the same variables. Too often
there is the tendency to assume that only low ability students
have a desire for or need of counseling. The distributions of
the SLA group would seem to refute this assumption. The
General College population provides us with the opportunity
to determine whether counseling can be equally effective with
low ability students.

³Hereafter designated as SLA.

⁴Computed according to Fisher's method (1, pp. 208-10).

The representativeness of our groups in terms of SLA
and General College freshmen was determined by a comparison
of high-school average grades transmuted into percentiles.
Unfortunately, statistics on sophomores in these colleges were
not available. Because of the known elimination of freshmen
with lower percentiles, the sophomore population should be
higher on the average. Since our experimental population
consisted of students from both classes, this analysis of
representativeness is not precise. The freshmen in our group were
compared with representative SLA and General College
samples. For SLA the combined mean for a sample of 2,157
freshmen students of the fall classes of 1933-34-35 was 65.45
as compared with 69.59 for our experimental freshman group.
Although this small difference of 4.14 is reliable (C.R. of
3.39), it does not represent a very significant one as far as
the purpose of this study is concerned. The comparison of the
means of the General College groups yields similar results.
The combined mean of representative freshmen of the 1931
and 1935 freshman classes was 34.00;⁵ for our group it was
40.25. Although the difference is somewhat larger than that
in the SLA group, it is not so reliable (C.R. of 2.64).
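
The critical ratios quoted above are a difference between means divided by the standard error of that difference. The sketch below shows that arithmetic and backs out the standard errors implied by the reported figures. The standard deviations of the groups are not given in the text, so none are invented here; the large-sample formula in `se_of_difference` is an assumption about the method, and all function names are hypothetical.

```python
import math


def se_of_difference(sd1, n1, sd2, n2):
    """Large-sample standard error of the difference between two
    independent means (assumed formula; SDs are not reported)."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def critical_ratio(mean1, mean2, se_diff):
    """Critical ratio: difference between means over its standard error."""
    return (mean2 - mean1) / se_diff


def implied_se(mean1, mean2, cr):
    """Standard error of the difference implied by a reported C.R."""
    return (mean2 - mean1) / cr


# Figures reported in the text.
sla_se = implied_se(65.45, 69.59, 3.39)  # SLA freshmen
gc_se = implied_se(34.00, 40.25, 2.64)   # General College freshmen
```

The implied standard error is about twice as large for the General College comparison, which is why the larger difference there is nevertheless the less reliable of the two.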

We may conclude from these two analyses of the nature
of our counseled groups that they were generally representative
of their total populations and of the total range of ability
to do college work. It is interesting to note that the students
who are counseled by the Testing Bureau, contrary to the
opinions of many, are not the students of inferior ability, but,
if anything, are slightly superior to the general undergraduate
population of these two colleges.

⁵From unpublished data collected by Dr. Ruth Eckert of the University of
Minnesota.

Results of the Experiment

Degree of Cooperation and Adjustment: Previous studies
of the effectiveness of counseling which used similar methods
have reported results in terms of percentages. The proportions
of our groups who were classified as satisfactorily or
partially adjusted (82.8 per cent of SLA and 86.2 per cent of
General College) compare favorably with those reported
in English studies. Oakley (3) and Macrae (2), working
with small populations of younger students, reported 95 per
cent and 55 per cent, respectively, as the proportions who
followed advice and who were satisfied and successful in their
occupational adjustment. Rodger (4), with a larger population,
reports 79 per cent successful adjustment. Seipp (5),
using a methodology almost identical with ours, analyzed the
case records of 100 adults diagnosed and advised by the
Adjustment Service of New York. She found that 57 per cent
made a satisfactory adjustment subsequent to counseling. Our
results are even more impressive when analyzed in terms of
those who cooperated in following the counselor's advice. In
these terms the percentage of the SLA students satisfactorily
or partially adjusted is 93.5 and the percentage of General
College students is 96.3.

Our data also indicate that the counseling was equally effective,
if not more so, in gaining the cooperation of the student.
For SLA 70.9 per cent cooperated wholly and 20.1 per cent
partly, while for General College the percentages were 69.7
who cooperated wholly and 24.1 partly. Viteles (7), diagnosing
and advising 75 adolescents, found that 58 per cent
followed advice completely and 21 per cent partly.

Since the SLA and General College groups differed so
markedly in college aptitude, it was interesting to determine
whether there was any real difference between the adjustment
classifications of the two groups of experimental cases. To
test this hypothesis, the chi-square test of independence
was used.⁶ The result (chi-square value of 3.48, p > .05)
indicates that there was no difference in the adjustment
achieved by the two groups. A similar analysis of the
cooperation classifications yields a similar result (chi-square value
of 2.51, p > .05). A further analysis showed that the length of
the student's attendance in the University was generally
unrelated to either adjustment or cooperation. The chi-square
values for the groups were insignificant in value.
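
The chi-square test of independence used in these comparisons can be sketched as follows. The counts in the example table are hypothetical, chosen only to show the shape of the calculation (two college groups by three combined adjustment categories); they are not the study's data. Expected frequencies are taken from the marginal proportions of the total distribution, and the five per cent critical values are standard tabled values.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency
    table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count from the proportions in the total distribution.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Standard five per cent points of chi-square for small df.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Hypothetical counts: rows are groups, columns are adjustment classes.
hypothetical_table = [
    [250, 160, 88],
    [95, 73, 27],
]
chi2, df = chi_square_independence(hypothetical_table)
significant_at_5pct = chi2 > CRITICAL_5PCT[df]
```

A value below the tabled five per cent point, as with the 3.48 and 2.51 reported above, gives no ground for rejecting independence.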

Adjustment versus Degree of Cooperation: In addition to
these direct analyses, we have attempted to shed light on the
definition of the conditions which make adjustment more
probable as an outcome of clinical counseling. First and foremost
of these conditions is that the student cooperate with
his counselor. Anyone who has had intimate experience in
counseling will have observed that the cultivation of a cooperative
attitude usually precedes effective counseling. That
the greater proportion of adjusted students found among
those students who cooperated is not accidental is clearly indicated
by the test results of the independence of these two
variables. The chi-square values of 11 ^,62 and 47.44 for
SLA and General College students are both highly significant
(p < .01). This means that we may assume that a student
who cooperates with the counselor in attempting a solution
of problems will in all probability achieve satisfactory adjustment
as defined above. Only the General College sophomores
did not exhibit a statistically significant relationship. The
restriction of the range of the variables necessitated by the
small number of cases in this group may explain this fact.

⁶This statistic may be used to test the independence of distributions from a
real or hypothetical distribution (6, chap. 1). As used in this study, the
expected distributions were based upon the proportions of the five classes of
adjustment observed in the total distribution. The formula for computation
appropriate for this type of analysis is

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

yielding a value which, by use of a table of chi-square distributions, is translated
into an estimate of the probability that such a value could have been obtained
for additional samples drawn from the same general population. We shall use
the conventional five per cent and one per cent points as our confidence limits.
These points are equivalent to values two and three standard deviations from
the mean. Because of the small number of cases, some of the categories were
combined in all of the chi-square tests used. Five was the smallest number of
cases permitted in any one category.

Expectancy of Adjustment According to Type of Problem:
A previous study by Williamson (8) has shown that counselors
tend to specialize in the types of problems that they
treat. It is important, therefore, to determine the effectiveness
of counseling with respect to different types of problems.
Since in most cases students experience more than one problem,
classification in any one category, e.g., vocational
problem, will include students who may also have an educational
problem, a social problem, or any other problem or
combination of problems. In view of this fact, if the vocational
category shows a significantly greater proportion of
adjusted students than the educational or the emotional
category, then evidence of the differential effectiveness of the
counseling will have been discovered.

An analysis of our data gives a clear indication that the
adjustment expectancy is not so marked for social-personal-emotional
problems as for vocational and educational types
of problems. While not all of the differences are statistically
significant (the total General College group and the General
College freshmen showed the significant ones: chi-square,
29.84, p < .01 and chi-square, 32.99, p < .01), the trends
are consistently in the same direction. Since it has already
been shown that cooperation can be assumed as a counseling
condition necessary to adjustment, it is not surprising to find
that there is a greater expectancy of cooperation for vocational
and educational problems than for social-personal-emotional
ones.

Expectancy of Adjustment According to Status of Vocational
Choice: Since the counseling being evaluated in this
experiment is primarily educational and vocational, the types
of changes in vocational orientation required should be of
importance for the expectancy of adjustment. Four possibilities
were defined: (a) confirmation of the student's choice
by the counselor, (b) recommendation by the counselor of
some choice other than the student's, (c) recommendation of
a choice by the counselor, because of the student's indecision
at the time of the original contact, (d) deferment of choice
on the counselor's advice at the time of original contact. It
had generally been assumed that the counselor is more likely
to bring about adjustment when he has only to confirm the
student's previous choice. The results of our experiment do
not support this assumption. They indicate that it makes no
difference, for this type of counseling, whether the student's
choice was confirmed or changed or whether he was undecided
at the time of the first interview. But in those cases where
choice is deferred, the expectancy of adjustment is significantly
less (chi-square of 28.59, p < .01). In the case of cooperation,
what "ought to be true" actually is true. As one would
suppose, greater cooperation is to be expected from those
students whose vocational choices were confirmed (chi-square
of 157, p < .01).

Aptitude and Achievement in Relation to Adjustment and
Cooperation: One might expect that ability and previous
achievement of students who come for counseling would be
positively related to expectancy of cooperation and adjustment.
This problem was attacked by studying the aptitude
and achievement characteristics of students in each of the
cooperation and adjustment categories. The analysis of the
variance in aptitude test scores gives evidence that this assumption
cannot be held in terms of the ability test used.⁷ The
variance ratios were of such a small degree that the probabilities
that they represented the same population were greater
than five in a hundred. This means that low ability students
are just as likely to be cooperative and adjusted as high
ability students.
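
The variance ratio (F) referred to here compares the variation between category means with the variation within categories. The following is a minimal sketch of a one-way analysis of variance; the scores are hypothetical and the function name `f_ratio` is not from the paper. Whether the authors' computation followed exactly this layout is an assumption.

```python
def f_ratio(groups):
    """One-way analysis of variance.

    `groups` is a list of lists of scores, one list per category.
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of category means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual scores about their own category mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical aptitude percentiles for three cooperation categories.
hypothetical_scores = [
    [62, 55, 71, 48, 66],   # followed advice wholly
    [58, 63, 49, 70, 52],   # followed advice in part
    [60, 45, 67, 54, 59],   # did not follow advice
]
f_value, df_b, df_w = f_ratio(hypothetical_scores)
```

The resulting F is then referred to a table of F with the two degrees of freedom; a value below the five per cent point is read as no reliable difference among the categories.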

On the other hand, the analysis gives reliable evidence
that high school achievement is positively related to cooperation
and adjustment (General College, F is 8.09, p < .01;
SLA, F is 5.45, p < .01). This relation is further emphasized
by the finding that for any degree of adjustment, those students
who cooperated were, for the most part, those with
previously higher achievement. Previous college achievement
could be analyzed validly only in relation to cooperation,
since it already had entered into the estimate of adjustment.
The results here are not conclusive. While the SLA data
yielded a significant variance ratio (8.22, p < .01), the General
College data did not.

⁷Snedecor's tables of F (6, p. 174) were used to estimate the probabilities
of getting as large a variance ratio from samples of a homogeneous population.
Here again the five per cent and one per cent points were taken as the limits
of confidence.

Number of Interviews versus Adjustment and Cooperation:
With respect to SLA students, variation in number of
interviews indicates that the counselor had the most interviews
with students who were partially adjusted (General College,
F is 4.13, p < .05; SLA, F is 20.84, p < .01). Thus those
students who are satisfactorily adjusted, characteristically,
do not require so many interviews. Likewise, those students
whose maladjustment is of such a nature (e.g., very low ability)
as to offer little probability of adjustment are not interviewed
so frequently. It seems, then, that those students who present
difficult problems but give promise of adjustment seek counseling
interviews most frequently. For SLA there is slight
evidence for a positive relationship of number of interviews
with judgment of degree of cooperation (F is 3.07, p < .05).
In the case of General College students, there is a negative
relationship between adjustment and the number of interviews.
The students in the low adjustment categories apparently
show a greater willingness or are more encouraged to return
for further counseling than are the more satisfactory adjustment
groups.

Time Interval versus Evaluation. The significance of the
time elapsed between the first counseling interview and the
follow-up interview should be of value in indicating the
optimum period for an evaluation experiment. Since there is
no reason to suppose that special selection has operated in the
selection of the time at which the adjustment groups were
studied, it is reasonable to infer that observed differences are
differences in time necessary for reaching that level of adjustment.
While the previous analysis of SLA data indicated a
greater number of interviews for the students classified in the
partially adjusted category, it is evident that students in this
category required the shortest time to achieve their degree of
adjustment. Analysis of the General College data supports this
result, a perplexing one. However, a more interpretable result
is secured when the data are analyzed in terms of the
interrelation of cooperation and adjustment. The trend is in the
direction of a shorter time interval for students in any degree
of adjustment who cooperated to a greater degree with the
counselor. The inference may be made that those students
who cooperated reached a given level of adjustment in a
shorter time. The difference averages a little over two months
in an average evaluation period of 16 months. The F value
of 3.4 is beyond the one per cent point.

Summary

This evaluation of the clinical counseling practiced in the
Testing Bureau of the University of Minnesota has attacked
two basic problems: (a) What proportions of students were
aided by the Bureau's counseling to achieve better adjustment?
(b) What conditions and characteristics of counseled
students are most conducive to a favorable prognosis of
subsequent counseling?

In answer to the first question, counseling was effective in
achieving the cooperation of and in improving the adjustment
of over 80 per cent of the students in our groups. This is
especially significant in that the analysis and classification of
cases were carefully defined and controlled, having been made
by judges who had not been involved in any of the counseling.

The conditions and characteristics favorable to adjustment
include the following:

1. Cooperation with the counselor was positively related
to adjustment and those students who cooperated reached their
level of adjustment in a shorter period of time than those who
did not.




2. Students experiencing educational and vocational problems
were more successfully counseled than were those with
dominant social-personal-emotional problems.

3. Contrary to belief, our data indicate no differences in
adjustment among counseling cases classified as vocational
choice confirmed, altered, or undecided at the first contact.
But, if vocational choice is deferred by the counselor, the
prognosis of adjustment is less favorable.

4. Higher high school or previous college achievement is
positively related to cooperation and adjustment. But level
of ability, as measured by the aptitude test used in this
experiment, is not related.

These conclusions may be interpreted as limitations either
of the students involved or of this type of counseling. In the
case of the type of problem, it is likely that a limitation of
counseling is disclosed. Counseling that is educationally and
vocationally oriented is not likely to deal so effectively with
social-personal-emotional problems. On the other hand, it
does not seem probable that any type of counseling or improvement
in treatment techniques can do much for a student with
a very low achievement background insofar as the types of
adjustment involved in this evaluation experiment are
concerned.

Certain relations of an ambiguous nature and therefore
demanding further study were observed. There was evidence
that the counselor conducts more interviews with students who
are judged as partially adjusted, yet this same group reached
their level of adjustment within a shorter period of time.
Our data do not indicate whether or not the counselor tends
to intensify his work with certain students by conducting many
interviews within a short period of time.

There is one conclusion that this study should have made
clear. The evaluation of counseling is not a casual process,
easily carried out. Indeed, such a study represents a combination
of careful and rigorous case reading, many days and weeks
of interviewing, prolonged clerical and statistical labor, and
above all a period of patient waiting for the counseling cases
to mature to the stage wherein adequate data are available
for critical evaluation.


REFERENCES 

1. Fisher, R. A. Statistical Methods for Research Workers (7th
ed.). London: Oliver and Boyd, 1938. 356 pages.

2. MacRae, A. "A Follow-up of Vocationally Advised Cases,"
Journal of the National Institute of Industrial Psychology,
V (1931), 242-47.

3. Oakley, C. A. "A First Follow-up of Scottish Vocationally
Advised Cases," Human Factor (London), XI (1937), 27-31.

4. Rodger, T. A. "A Follow-up of Vocationally Advised Cases,"
Human Factor (London), XI (1937), 16-26.

5. Seipp, Emma. A Study of One Hundred Clients of the
Adjustment Service (Adjustment Service Series, Report XI).
New York: American Association for Adult Education, 1935.
30 pages.

6. Snedecor, G. W. Statistical Methods. Ames: Collegiate Press,
Inc., 1937. 341 pages.

7. Viteles, M. S. "Validating the Clinical Method in Vocational
Guidance," Psychological Clinic, XVIII (1929), 69-77.

8. Williamson, E. G. "Faculty Counseling at Minnesota, An
Evaluation Study of Social Case Work Methods,"
Occupations, XIV (1936), 426-33.

9. Williamson, E. G. How to Counsel Students. New York:
McGraw-Hill, 1939. 561 pages.

10. Williamson, E. G. and Bordin, E. S. "Evaluation of Vocational
and Educational Counseling: A Critique of the Methodology
of Experiments," Educational and Psychological
Measurement, I (1941), 5-24.

11. Williamson, E. G. and Darley, J. G. Student Personnel Work.
New York: McGraw-Hill, 1937. 313 pages.





CONTRIBUTION OF TESTS TO RESEARCH IN THE 
FIELD OF STUDENT PERSONNEL WORK 


RALPH W. TYLER
University of Chicago

THE USE of tests is fundamental to many aspects of
student personnel work. In the selection of students, in
identifying their potentialities, their problems, and their
difficulties, in checking on the effectiveness of procedures used
in providing for personal development, in vocational placement
and follow-up, personnel workers have learned to use
a wide range of tests and to depend on the results of tests
as a basic part of the personnel program. Although the place
of tests in the practice of personnel work has been well
outlined, the contribution to be expected from tests in connection
with personnel research has not been so clearly indicated. I
am differentiating research from practice in the field of student
personnel work by defining research as the process by which
basic facts, theories, principles, instruments, and procedures
are developed, thus providing a rational framework upon
which the practice of student personnel work can be understood
and elaborated. This distinction may perhaps be made
clearer by illustration.

It is a common practice of the student personnel officer
to administer reading tests to incoming freshmen, to study the
results, to identify certain students who received relatively
low scores on the reading tests, and to recommend a remedial
program in reading for some of these students. Research
which finds out what reading demands are likely to be made
by the various freshmen courses, which devises valid instruments
for measuring these reading abilities, which estimates
the probable frequency of inadequate reading abilities among
freshmen, which develops theory and principles regarding the
relation of reading development to other aspects of the
student's development, and which establishes the probable
validity of various types of remedial reading procedures
would represent the essential framework upon which improved
personnel practices relating to reading can be built. Practice
and research are complements in a sound professional growth.

Because student personnel work may be concerned with
all aspects of the personal and social development of students,
its problems relate to many previously organized fields of
research such as physiology, psychology, sociology, anthropology,
psychiatry, and education. Obviously, personnel workers
have drawn and must draw upon these various organized
fields of research for many of their concepts, instruments, and
practices. However, many problems which the student personnel
worker faces cut across two or more of these fields
and are likely to involve research aspects not adequately
investigated by any one of these disciplines alone. The problems
which do involve two or more organized disciplines are
the problems which in general must be attacked by research
workers in the field of student personnel. May I indicate
some of these problems and suggest contributions which tests
have made or can make to research on these problems?

One major research problem is to delineate clearly desirable
goals for a student personnel program. Accepting the
general function of personnel work to be the facilitation of
well-rounded personal and social development of students, it
is evident that this function must be defined more clearly in
the case of a given college or type of college so as to indicate
the aspects of development to be promoted and the desired
relation among these various aspects. This clearer picture of
the phases of student development to be given attention and
their relation is essential to the intelligent direction of a program
aimed at facilitating well-rounded development of the
individual student.




It is obvious that a profession should have its goals clearly
and definitely in mind; it is not so obvious that the formulation
of these goals for the field of student personnel is a
research problem of considerable magnitude. The difficulty
of the task is partly due to the complexity of human development.
Well-rounded personal and social development includes
physiological, psychological, and social aspects. Furthermore,
these various aspects are interrelated; that is to say, physiological
development influences and is influenced by psychological
and social development. Correspondingly, psychological
development influences and is influenced by physiological and
social development, and social development influences and is
influenced by physiological and psychological development.
Hence, although research in the several established disciplines
helps to identify characteristics of normal physiological
growth, of psychological maturation, and of social development,
special research of a co-ordinated or integrated nature
is necessary to establish the desirable balance among these
several aspects of student development.

A second factor which complicates the formulation of goals
for this field is the relation of student personnel work to the
rest of the college program. In order that a college have the
most effective influence upon its students the various phases
of the college program, curricular and extracurricular, need
to have some underlying coherence; that is, they must be bound
together by common purposes. The major purposes of a
college are educational, and the acceptable goals of student
personnel services also should be at least in harmony with the
educational purposes of the institution, and preferably they
should serve to promote these educational purposes. In the
actual practice of student personnel work there is danger that
we shall carry on activities day after day without carefully
considering their relation to the primary aims of our institution.
This may lead to a short-sighted program in which
immediate goals are attained without really promoting the
ultimate goals of the college. It is possible, for example, to
work out a plan of housing which provides for very quick
adjustments of the students to their classmates, and yet by the
nature of the housing plan, cliques may be encouraged and
the fundamental educational objective of learning to understand
people with very different backgrounds and to enter
sympathetically into the lives of persons very different from
ourselves may be hindered rather than promoted. Or, a social
counselor may feel that her job is well done when she has
helped to increase the proportion of women students who have
regular dates, whereas the ultimate educational objective is
to get a broader understanding of human behavior including
a sympathetic understanding of and adjustment to the opposite
sex. Continued dates with the same individuals in many cases
may retard the attainment of this objective rather than help
it. It seems necessary, therefore, to formulate goals for
student personnel work in such a way that they are closely
related to the major educational objectives of the institution.

This implies that the student personnel worker in close
collaboration with other members of the school or college
staff will need to examine the various types of studies which
suggest possible goals of student development. They will need
to consider the investigations of the sociologist, the social
anthropologist, the social psychologist, the economist, and the
political scientist, to identify the demands which our culture
makes upon young people and to understand the effect of
cultural pressures upon the individual and his group. These
studies of the social scientists represent an important component
from which goals for student personnel work will be
formulated.

But an examination of results of research in the social
sciences is not enough. It is also necessary to examine studies
of student health and investigations in the fields of physiology,
nutrition, and psychiatry, for these help to clarify the concept
of desirable biological development and also to indicate possible
deficiencies which students may be helped to overcome.

A third component of research regarding goals for student
personnel work is the field of values. Values need to be
considered carefully not only as possible student goals but also
because values, individual and cultural, condition the student's
development in many ways. The ideals which young people
absorb from contact with the culture have a more potent
influence upon student goals, student activities, and the
satisfactions and disappointments of college life than is commonly
realized. Any comprehensive formulation of goals for student
personnel work needs to consider the values which the school
or college may be expected to promote and the way in which
school or college experiences may influence these values.

I have suggested several of the strands which need to be
considered in delineating goals for student personnel work. It
is obvious that the selection of goals to be given particular
emphasis in a particular college depends upon several factors.
One is the college's conception of the good life and its
counterpart, the desirable person. This conception will
represent not only specific items such as physical health, social
concern, personal integrity, and the like, but it will also involve
some idea of the relation of these various aspects. At this
point it is very necessary for the student personnel worker to
have a workable but comprehensive theory of personality
structure and function, and of personality development.
Because we do know that the human organism shows a
considerable degree of unity in its reactions, because we do know
that physical, social, and psychological aspects are interrelated,
we realize that one cannot treat each aspect of a student's
development in isolation from the others. Some theory as to
how these aspects are related, how they function together,
and how they may be developed together is essential to provide
a rational basis for personnel work. If the student personnel
worker together with other members of the school or
college staff has identified more specifically the aspects of
human development which the school or college seeks to promote,
and if he has a comprehensive theory of personality
development, it is possible to formulate clear yet comprehensive
goals for his own work and to avoid treating a student
as though he were a mechanical collection of specific reactions.

What contributions have tests made or can tests make to
this area of research? A test provides a controlled situation
in which certain specified types of behavior may be studied
and certain phases of this behavior may be measured. In the
effort to formulate a coherent theory of personal and social
development various types of tests must be constructed so
that students can react in ways which involve the relation of
biological, social, and psychological phases of behavior.
These tests may enable us to see more clearly how these
phases of behavior are related. Furthermore, any college,
after determining the tentative goals of its student personnel
work, can employ tests to determine which of these goals are
of primary importance to its students. Conscious attention
need not be given in the college program to those points at
which students are already developing satisfactorily. That is
to say, tests contribute to this area of research both in developing
a comprehensive set of goals and in identifying the goals
which need major attention in a particular college at a
particular time.

A second area of research in the field of student personnel
work is the testing of the fundamental bases upon which a
student personnel program is built. A well-rounded plan of
personnel services is a recent addition to the college campus.
Most of the schemes have been based upon assumptions which
have not been adequately tested. The principles of organization,
of administration, of the selection of the staff, of the
training of the faculty, all are in need of careful verification.
These principles seem to the administration or faculty of the
given institution to be sound, but in many cases they have been
drawn from fields and experiences which are not strictly
parallel to the field of student personnel, and it is likely that
some of these principles are not appropriate as part of the
foundation of the program of student personnel services.
Research provides a check on the validity of the basic foundation
of the personnel program. Such research involves
comprehensive evaluation of an entire personnel program or of
particular procedures.




A comprehensive evaluation provides evidence showing
how far each of the important objectives or goals of student
personnel work is being attained. Since these goals involve
various aspects of student development, tests of various sorts
are essential in order to find out the points at which students
are developing adequately or the points at which development
is unsatisfactory. For example, this research requires tests
of physical development, of health, of personal-social adjustment,
of attitudes, of interests, of skills, of information
acquired, and the like. It also involves a periodic program
of testing so as to estimate the progress being made by the
students, and correspondingly, their rate and degree of
development. Furthermore, an adequate research program
provides a follow-up of students after they have been graduated
from college and have gone out into life. These follow-ups
should probably be made from five to ten years after
graduation and should include the collection of data regarding
those objectives which have most permanent significance. This
probably would include evidence regarding intellectual interests,
health practices and attitudes, marital adjustment,
social-civic interests and activities, and maturity of aesthetic
interests. Such a follow-up study provides an important type of
data regarding the continuing development of students and,
therefore, it is a significant phase of the evaluation of the
personnel program.

The checking of the fundamental bases upon which the
student personnel program is built is an area of research
which has largely been dependent upon valid tests. The recent
accelerated development of a wider range of tests has been
accompanied by a corresponding increase in evaluative studies.
Tests are making an important contribution to this area of
research.

A third area of research in the field of student personnel
work which involves tests is the construction and validation
of instruments to facilitate the personnel program. Tests
represent the major group of these instruments. Various tests
have been constructed for use in selecting students likely to
benefit from a given college program. Much research is still
needed in identifying important characteristics of young people
which can be used as a basis for college selection and for
planning programs of educational and vocational guidance. Thus
far, these tests have largely consisted of measures of verbal
facility, of numerical manipulation, and of the acquisition of
information. Tests of higher intellectual skills, of interests, of
personal-social adjustment, and of attitudes are just beginning
to contribute markedly to the selection and guidance work.

Tests already developed have greatly facilitated the
identification of students needing special attention, but many new
instruments are also needed. Tests are widely available for
identifying certain types of reading difficulty and certain types
of subject-matter deficiency. New instruments are needed,
however, to measure other types of psychological and social
reaction which have fundamental significance for success in college
and in life, and which should be identified early enough so that
a program to facilitate development may be begun.

A similar condition exists with regard to tests useful in
the vocational placement of students. Tests of some of the
essential vocational skills have been of great value. Tests for
identifying certain vocational interests are showing promise.
However, some of the fundamental vocational attitudes,
habits, and ways of thinking have not been clearly identified,
nor have satisfactory tests for them yet been developed. The
future contributions of tests of this type are likely to be large.

New tests are being constructed to help in evaluating
personnel programs and procedures. Judgments of students and
faculty have not only been supplemented by more careful case
studies and observational records, but tests of attitudes, of
interests, of habits and practices, of information and skills are
becoming available for a more comprehensive evaluation.
Additional tests are still needed, and many are in the process
of construction.

I have attempted to suggest briefly the place of tests in
three areas of research, namely, in delineating goals for
student personnel work, in checking the fundamental bases
upon which personnel programs and procedures are developed,
and in constructing essential instruments for personnel work.
Tests have already made an important contribution to these
three areas of research, but the future contributions should be
far greater than those of the past. The limitations of the
contributions of the past seem to me to have been due to
social factors which now can be largely overcome.

In the first place, student personnel work originated
largely from specific maladjustments within the traditional
college program. Particular problems relating to the conduct
and morals of students, their social life, or their housing led
to the provision of special staff members to iron out these
difficulties. Only within recent years has there been wide
recognition of the broad implications of student personnel work and
of the need for some coherent philosophy and program.
Naturally, tests used in the student personnel field frequently
were taken over, as they were developed, for other purposes
and used without consideration of the behavior patterns which
these tests implied. It seems to me that we are now ready to
formulate a coherent conception of personal and social adjustment
and to examine possible tests in the light of our concept,
discarding or modifying tests which do not appropriately fit
this concept and developing new tests that are in harmony
with it.

With this bit-by-bit accumulation of personnel responsibilities
in the college program, it was natural that the tests used
should largely be built upon a type of atomistic concept of
human behavior, and that the test results should be summarized
as single scores or as separate parts added together
to form a total score. In recent years we have seen more
clearly how to construct tests involving greater organization
of behavior and how to summarize results in terms of descriptive
scores relating various parts of a test, thus getting a
more coherent picture of the student's response. This elimination
of the single composite score is an important step in
increasing the contribution tests make to the field of student
personnel work.




An additional reason for my belief that tests will make an
increasing contribution to the field of student personnel work
is the wide recognition of a broad definition of tests. No
longer are tests conceived only as paper-and-pencil examinations.
Tests are increasingly considered as controlled methods
for obtaining a sample of a student's reactions under certain
specified conditions. With this recognition that a test is a
means of sampling certain aspects of human behavior, attention
is now being focused upon clearer definitions of those
aspects of human behavior which need to be sampled by means
of tests. In educational testing twenty years ago, primary
attention was given to sampling the content of textbooks which
students were expected to remember and to sampling certain of
the subject-matter skills, such as writing or numerical
computation. It is now recognized that other aspects of behavior
are important, such as the way in which the student attacks
problems, the types of interests he is developing, the attitudes
he has, his response to aesthetic experiences such as literature,
music, and the arts.

With greater clarification of the nature of testing has come
a better specification of the behavior to be tested. Twenty
years ago a test in chemistry would be built by specifying the
topics, that is, the content to be sampled. No conscious effort
was made to specify the type of reaction the student might
be expected to make to this content. Now we recognize that
we must specify not only the content but also the kind of
reaction expected of the student, the sort of situation in which such
reaction can be expected and, if possible, the kind of purpose
which a student would have when reacting. By specifying these
four aspects of behavior we have a much clearer idea of what
we are trying to test, and this increases the probability that
we shall control the testing situation sufficiently to provide
a satisfactory test.





GRADE AND AGE NORMS FOR THE MINNESOTA
VOCATIONAL TEST FOR CLERICAL WORKERS¹


GWENDOLEN G. SCHNEIDLER
University of Minnesota


PROGRESS in the applications of psychology, especially in
the field of aptitude measurement, will be made by working
intensively on the measuring instruments which we already
have, rather than by adding to the large number of devices
about which we have insufficient research to justify scientific
application. In line with this belief we have investigated certain
problems connected with the Minnesota Vocational Test
for Clerical Workers. The portion of this research to be
reported here deals with a normative study of this test.

The usefulness of the Minnesota Vocational Test for
Clerical Workers has been seriously curtailed because of the
fact that norms have been established only for adults in the
general population and for employed adult clerical workers
(6). This limitation is the reverse of the more usual and
serious one where norms exist for school populations while
no adequate norms exist for adults and for criterion groups.
The problem of appropriate norms for tests is one of the most
urgent ones which counselors face in applying measuring
instruments in guidance programs where individual analysis
and diagnosis is an indispensable first step. The Minnesota
Vocational Test for Clerical Workers was standardized on
adults, and excellent norms were developed and reported in
the Bulletins of the Employment Stabilization Research Institute
(5, 9) and in the test manual (6). The test with its
norms for adults was used to advantage by the Adjustment
Service in New York City, a community guidance agency for
adults, and by other agencies concerned with the counseling
of adults. As the test has become more widely adopted, however,
it has been applied in many situations, especially to youth
populations for which the significance of test scores was not
known. Some workers have devised local norms which have
reflected selective factors of sampling. Limitations in interpretation
have necessarily accompanied limitations in the selection
of the sample. What has been needed is a normative study
of this test based upon a large sample of youth representative
of the populations at the junior and senior high school levels.
With such research it becomes possible to apply the Minnesota
Vocational Test for Clerical Workers to the age range for
which the test is most appropriate from the standpoint of
educational and vocational guidance.

¹ The cooperation of many persons has been necessary for the completion
of this study and the author wishes hereby to express her appreciation. Professor
Donald G. Paterson directed the construction of this measuring instrument by
Dr. Dorothy M. Andrew and has followed through with advice and helpful
suggestions in subsequent research. Assistance in the preparation of some of the
materials for this study was furnished by the personnel of Work Projects
Administration, Official Project Number 665-71-3-69, Subproject Number 229.

The Minnesota Vocational Test for Clerical Workers is
composed of two subtests. Test I consists of 200 paired numbers
varying in length from three to 12 digits. Test II consists
of 200 paired names varying in length from seven to
16 letters. Slight changes had been made in half of the paired
items and the subject is asked to compare the paired items
as rapidly as possible, checking those pairs which are identical.
He is allowed eight minutes for Test I and seven minutes for
Test II. Scores are calculated on each of the two subtests
using the "right minus wrong" formula. The administration
of the test is described in the manual and other sources
(6, 4, 9).
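The "right minus wrong" formula can be illustrated with a minimal sketch. The function name, the representation of responses as True/False/None, and the treatment of unreached items as neither right nor wrong are assumptions made for illustration; the test manual remains the authority on actual scoring.

```python
def right_minus_wrong(responses, answer_key):
    """Score one subtest as (items right) minus (items wrong).

    responses  : one entry per item; True means the examinee judged the
                 pair identical, False means different, None means the
                 item was not reached (assumed here to be unscored).
    answer_key : booleans, True where the pair really is identical.
    """
    right = 0
    wrong = 0
    for marked, identical in zip(responses, answer_key):
        if marked is None:       # item not attempted before time was called
            continue
        if marked == identical:
            right += 1
        else:
            wrong += 1
    return right - wrong


# Hypothetical five-item example: three right, one wrong, one not reached.
key = [True, False, True, True, False]
answers = [True, False, False, True, None]
score = right_minus_wrong(answers, key)
```

Under this formula a pupil who marks carelessly loses credit for each error, so speed gained at the expense of accuracy does not raise the score.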

The reader interested in research evidence of the test's
reliability and validity, and in information regarding adult
norms and the relationship between test scores and other variables
is referred to the references at the end of this article and especially
to the monograph by Andrew and Paterson (5). The following
paragraphs summarize the research very briefly.

Andrew (5) has presented evidence on reliability which
indicates that the test yields sufficiently stable results for use
with individuals.

Andrew (5) has also presented a considerable body of
evidence which points to the test as a valuable technique in a
clinical program of educational and vocational guidance or
selection to eliminate persons not likely to succeed in clerical
training or employment. The test results correlate highly with
high school and college teachers' ratings of clerical aptitude,
in fact, higher than does a test of general intelligence. They
also are definitely related to achievement records in typing
and to criteria of production on clerical jobs as well as to
supervisors' ratings of proficiency on the job. The test appears
to be measuring factors other than academic intelligence or
clerical training and experience and it is better than other
tests for differentiating clerical workers from persons in the
general population.

The relationship between the two subtests is not sufficiently
high to justify using one test alone or combining the
two scores (5). Reading speed is not an important factor
in the test.

The method of scoring is that of "right minus wrong."
Despite certain criticisms of this technique (7), it can be
upheld on logical bases (10).

Significant sex differences on test scores have been reported
(5) for men and women in the general population but not
for men and women employed in the same type of clerical
positions.

The test author (1, 2, 3) has made an analysis of the test
to determine the abilities which it is measuring and has concluded
that Test I involves a numerical factor and Test II a
verbal factor and that both are relatively unrelated to
academic intelligence, ability to perceive spatial relationships,
and dexterity with fingers and small tools.



To secure norms which would be fairly representative of a
cross-section of junior and senior high school pupils in the
North Central Association of Secondary Schools, St. Paul, a
midwestern city of one-quarter of a million population, was
chosen. Approximately 4,000 pupils in grades eight through
twelve were given the Minnesota Vocational Test for Clerical
Workers. This does not represent the entire population for
these grades. In order to guard against a possible selective
sampling in the choice of schools, an attempt was made to
select at each grade level schools representing the upper,
middle, and lower socio-economic groups. One high school and
one junior high school, judged to represent each of these three
groups, were chosen. To guard further against securing a
selective sampling within the schools, the pupils were tested
in English classes, since English is a subject required of all
irrespective of curriculum followed. Table 1 shows the number
of pupils included in the norms, distributed by grade, sex,
and school. Schools A and B represent the above-average
socio-economic groups, schools C and D represent the average,
and schools E and F were characterized by a large proportion
of families in the lower socio-economic groups.

The testing procedure was standardized and adhered to
throughout the program with testing done in the regular English
classes. The administration of the test was that prescribed
by the test author (4, 5, 6). Personal data items
including identifying data, date of birth, grade, school, curriculum,
and father's occupation were filled out by the pupils on
the last page of the test folder (10) before the test itself
was administered. Birth dates were checked against school
records. Additional data, such as high school scholarship percentile
rank and intelligence test scores,² were collected for
certain pupils and recorded on the personal data sheet. All
tests were rescored at least once.

An important prerequisite to the publication of norms on
tests which are to be used widely is a careful description of the
population on which the norms were based. Only in this way

² The Aptitude Index of the Van Wagenen Unit Scales of Aptitude, Forms
E, D, or C.


TABLE 1

DISTRIBUTION OF CASES BY GRADE, SEX, AND SCHOOL

[Table body not recoverable from the source.]

Letters designate the different schools.

are test consumers able to determine whether or not the norms
are appropriate and applicable to their local populations.
Before presenting the tables of norms we shall, therefore,
describe the sample of the school population to which we
administered the Minnesota Vocational Test for Clerical
Workers, presenting statistics on intellectual characteristics,
age-grade locations, and socio-economic levels.

Table 2 describes a large proportion³ of our sample at
each grade level in terms of the central tendency and variability
of intelligence test scores. This table also gives similar
figures for the total St. Paul school population for these

TABLE 2

COMPARISON OF THE MEANS AND S.D.'S OF THE ST. PAUL SCHOOL POPULATION AND A LARGE
PERCENTAGE OF OUR SAMPLE ON THE BASES OF THE UNIT SCALES OF APTITUDE: APTITUDE INDEX*

                                                                        Diff. /
Grade  Groups                          N       Mean      S.D.    Diff.  S.D. of Diff.

VIII   St. Paul School Population    2,327   105.3567  11.7982
       Our Sample                      511   107.7757  11.7785   2.4190    3.1366

IX     St. Paul School Population    2,169   101.7833  13.7310
       Our Sample                      561   105.4929  13.2920   3.7096    5.9330

X      St. Paul School Population    2,299   104.9674  13.3065
       Our Sample                      716   106.0964  12.6935   1.1290    2.0610

XI     St. Paul School Population    1,850   106.8270  12.2660
       Our Sample                      758   107.7084  12.0660    .8814    1.6866

XII    St. Paul School Population    1,323   108.3447  11.2010
       Our Sample                    1,015   107.3470  11.9260   -.9977    1.4996


* These figures for the St. Paul school population were provided through the
courtesy of Professor M. J. Van Wagenen of the University of Minnesota.


³ Ninety-three per cent of the total 3,904 cases are so described. No intelligence
test score for the remainder could be located but there is no reason to
suspect the operation of a selective factor here.



grades. There is no necessity that our sample should be strictly
representative of the St. Paul population. That would have
been required if we had desired to develop norms appropriate
only to the St. Paul school population at a particular date.
Our purpose has been to develop norms on a sample judged
to be fairly typical of the school population in North Central
secondary schools and then to describe that sample as
adequately as possible.

It can be seen that our sample differs from the St. Paul
school population by from less than one to less than four
points on the average, depending upon the grade. These
small differences are more significant statistically for grades
eight and nine than for grades ten, eleven, and twelve, as can
be seen from the ratios of differences to the standard deviations
of those differences. Using this test of representativeness,
then, we can say that our tenth, eleventh, and twelfth
grade students are, on the average, more like the St. Paul
school population from which they were drawn than our eighth
and ninth grade students. The differences are small, however,
and it is not necessary for our purpose that the sample be
exactly equivalent to this particular population. Furthermore,
the slight differences in average scores on an intelligence test
would probably not significantly affect the distribution of scores
on the clerical test, which is not measuring intelligence to any
great extent.

As a further description of our sample, Table 3 shows
the percentage of each age represented in each of the five
grade groups.⁴ The age is that at the nearest birthday.

A still further description of our sample was obtained by
determining from the pupil's statements the occupation of the
father and then distributing these occupations according to the
categories of the Occupational Rating Scale⁵ of the University

⁴ The reader may be interested in noting the resemblances between this distribution
and that which Terman and Merrill used for the standardization of
the revised Stanford-Binet test. L. M. Terman and M. A. Merrill, Measuring
Intelligence (New York: Houghton Mifflin, 1937), p. 17.

⁵ Florence L. Goodenough and John E. Anderson, Experimental Child Psychology
(New York: Appleton-Century, 1931), pp. 501-12.



TABLE 3

AGE-GRADE DISTRIBUTION IN PERCENTAGES FOR CASES INCLUDED
IN THIS STUDY

                                      Age
Grade      N     12   13   14   15   16   17   18   19   20   21

VIII       ?      6   35   43   10    4    1
IX       659           3   39   42   12    4
X        798                5   35   41   16    2    1
XI       808                     7   36   40   12    4    1
XII    1,077                          5   38   38   14    3    1


of Minnesota Institute of Child Welfare. A total of 3,347 of
our 3,904 cases were so classified from occupations as given
by the pupils whose fathers were living, employed in an urban
community, and not on relief. There were 557 cases not classified,
and these included some for which the information was
inadequate. The elimination of these groups from the classification,
therefore, tends to give a slightly distorted picture of
our sample, weighting it for the upper socio-economic levels.
Such elimination was necessary for comparative purposes with
figures available for the United States population and a similar
urban community, Minneapolis.

Table 4 presents the results of this classification for each
grade and for those of our total sample who were classified.
Comparisons may be made first with the distribution for Minneapolis.
Our sample appears to be slightly skewed towards
the higher occupational levels. Part of this is accounted for in
the number who were unclassified. Despite that, 46 per cent of
our sample have fathers in the upper three occupational
groups, and 49 per cent of the Minneapolis male population
are in these three groups. There are more striking discrepancies,
however, when our sample is compared with that for
the male population of the United States as a whole. It is


TABLE 4

PARENTAL OCCUPATIONS OF CASES IN THIS STUDY: DISTRIBUTION OF THE KNOWN OCCUPATIONS OF FATHERS WHO WERE LIVING, EMPLOYED

[Table body not recoverable from the source.]


unlikely that norms developed in a single community would
be typical of all communities. Norms developed in a single
locality, when well described, are more useful than those
derived from many diverse populations, a combination of
which may not be typical of any one situation.

Table 5 presents the condensed grade norms⁶ for the
decile points for boys and girls separately in grades eight
through twelve on the number checking (Test I) and name
checking (Test II) tests of the Minnesota Vocational Test for
Clerical Workers. These norms were derived from ogive
curves constructed from the distributions of cases including
all ages within each grade. The number of cases and description
of subjects at each grade level have been reported earlier
in this article. The test user who wishes to apply this test
to subjects at these grade levels should consider whether his
population is similar to the one used for calculation of these
grade norms.
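A test user who wishes to build comparable decile points for a local group could proceed along the following lines. This is only a sketch: the article read its decile points from ogive curves, whereas the code below substitutes straight-line interpolation between ordered scores, and the function names and sample data are hypothetical.

```python
def decile_points(scores):
    """Return 11 values: the scores at the 0th, 10th, ..., 100th percentiles.

    Linear interpolation between ordered scores stands in for reading
    the values off a hand-drawn ogive (an assumption of this sketch).
    """
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one score")
    points = []
    for d in range(11):
        pos = d * (n - 1) / 10          # fractional index into ordered scores
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        points.append(ordered[lo] + frac * (ordered[hi] - ordered[lo]))
    return points


def decile_rank(score, points):
    """Highest decile (0 through 10) whose norm point the score reaches."""
    rank = 0
    for d, cut in enumerate(points):
        if score >= cut:
            rank = d
    return rank


# Hypothetical local distribution: one pupil at each score from 0 to 100.
local_points = decile_points(list(range(101)))
rank_of_55 = decile_rank(55, local_points)
```

Comparing such local decile points with the tabled grade norms is one way to judge whether a local population resembles the St. Paul sample.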

Some persons will prefer age norms, and for this reason
we are presenting in Table 6 age norms for these same subjects
who were enrolled in grades eight through twelve. We
recommend the use of the grade norms whenever possible,
however, as they represent actual grade populations which
have been described. The age norms do not include an entirely
representative sampling at these ages since we included only
those pupils enrolled in school in grades eight through twelve.
Furthermore, unequal numbers were selected at the various
grade levels. Actually, however, the similarities between age
and grade norms are more striking than the differences. The
grade eight norms are similar to the age norms for fourteen-year-old
pupils, for example. Also notice the striking resemblance
between the norms for eleventh grade and seventeen-year-old
pupils.

In conclusion, it is suggested that the grade norms should 

⁶ Complete percentile norms are available in reference 10 and from the test
distributors, The Psychological Corporation, 522 Fifth Avenue, New York City.
For all practical purposes, however, the less refined interpretations of the test
scores will be all that are required.




TABLE 5

CONDENSED GRADE NORMS FOR BOYS AND GIRLS IN GRADES EIGHT THROUGH TWELVE ON
TESTS I AND II OF THE MINNESOTA VOCATIONAL TEST FOR CLERICAL WORKERS

                     Score                 Score
Deciles        Test I   Test II      Test I   Test II

                         Grade VIII
             Males (N = 2??)       Females (N = 288)
  10            1?0      115          165      160
   9            108       99          1?1      1?0
   8            10?       9?          11?      10?
   7             93       86          110      102
   6             89       81          10?       96
   5             8?       77          10?       93
   4             80       72           95       87
   3             76       67           90       82
   2             72       61           85       78
   1             6?       ?7           76       70
   0              ?       15           2?       15

                         Grade IX
             Males (N = 332)       Females (N = 327)
  10            155      165          180      180
   9            120      115          131      128
   8            110      103          123      119
   7            104       9?          116      111
   6             98       88          111      106
   5             91       81          107      101
   4             88       79          101       95
   3             8?       71           95       91
   2             78       69           91       81
   1             71       60           83       75
   0             30       30           50       ?0

                         Grade X
             Males (N = 372)       Females (N = 416)
  10            165      170          185      180
   9            123      121          14?      140
   8            11?      109          13?      130
   7            106      102          1?7      121
   6            102       96          1?0      11?
   5             98       91          11?      106
   4             92       86          109      100
   3             8?       81          101       94
   2             81       73           97       89
   1             75       65           87       79
   0             55       35           55       30

                         Grade XI
             Males (N = 381)       Females (N = 427)
  10            180      195          190      190
   9            1?1      132          149      1?7
   8            121      115          1?0      137
   7            111      110          133      129
   6            106      103          128      124
   5            101       97          122      11?
   4             96       91          117      11?
   3             91       86          111      106
   2             85       79          106      100
   1             76       71           97       90
   0             45       40           55       55

                         Grade XII
             Males (N = 539)       Females (N = 538)
  10            185      180          195      195
   9            137      1??          151      153
   8            128      127          1?2      1?5
   7            118      119          1?5      137
   6            11?      11?          129      131
   5            10?      101          122      124
   4            100       98          118      117
   3             95       92          112      109
   2             89       85          106      102
   1             82       78           97       92
   0             50       40           30       10



TABLE 6

CONDENSED AGE NORMS FOR BOYS AND GIRLS IN GRADES EIGHT THROUGH TWELVE ON
TESTS I AND II OF THE MINNESOTA VOCATIONAL TEST FOR CLERICAL WORKERS

                     Score                 Score
Deciles        Test I   Test II      Test I   Test II

                         Age 14
             Males (N = 246)       Females (N = 297)
  10            142      162          162      172
   9            114      115          131      129
   8            107      103          122      118
   7            101       91          117      111
   6             95       89          112      105
   5             89       83          107      100
   4             85       78          103       95
   3             80       73           97       90
   2             75       68           92       83
   1             69       62           84       75
   0             42       32           27       37

                         Age 15
             Males (N = 323)       Females (N = 345)
  10            162      167          182      192
   9            124      121          138      139
   8            110      111          130      129
   7            105      101          121      120
   6             99       94          115      112
   5             94       89          110      106
   4             90       83          105      100
   3             86       78          100       95
   2             80       70           94       88
   1             72       60           85       77
   0             52       37           52       47

                         Age 16
             Males (N = 362)       Females (N = 411)
  10            182      192          187      187
   9            127      135          1?5      146
   8            11?      115          137      136
   7            109      105          131      128
   6            101       97          123      121
   5            100       91          11?      113
   4             94       86          112      105
   3             88       81          107      100
   2             83       75          100       93
   1             77       68           90       84
   0             47       37           57       47

                         Age 17
             Males (N = 433)       Females (N = 454)
  10            177      177          192      182
   9            135      137          150      150
   8            125      122          140      140
   7            117      112          133      132
   6            110      105          128      125
   5            104      100          122      118
   4             99       94          116      111
   3             93       88          110      104
   2             86       81          105       97
   1             78       71           96       83
   0             47       37           57       47

                         Age 18
             Males (N = 26?)       Females (N = 259)
  10            1?2      177          177      172
   9            135      130          152      150
   8            124      122          142      139
   7            114      11?          134      131
   6            107      105          128      125
   5            102       99          122      118
   4             9?       90          118      111
   3             94       85          113      105
   2             88       79          107       97
   1             79       70           97       87
   0             52       42           32       62

be useful in junior and senior high schools and commercial
business colleges which are concerned with the distribution of
their pupils to the commercial classes upon the basis of aptitude
for clerical work rather than upon the basis of such factors as
lack of aptitude for college training. This test with its norms
should contribute toward a more scientific and wiser counseling
of pupils based upon all of the pertinent and available information.
Norms for adults employed in clerical occupations
also might be employed in judging a pupil's clerical aptitude.



REFERENCES 

1. Andrew, Dorothy M. "An Analysis of the Minnesota Vocational
Test for Clerical Workers. I," Journal of Applied
Psychology, XXI (1937), 18-47.

2. Andrew, Dorothy M. "An Analysis of the Minnesota Vocational
Test for Clerical Workers. II," Journal of Applied
Psychology, XXI (1937), 139-72.

3. Andrew, Dorothy M. "An Analysis of the Minnesota Vocational
Test for Clerical Workers," Ph.D. thesis, University of
Minnesota Library, 1935.

4. Andrew, Dorothy M. "The Construction and Standardization
of a Test for File Clerks," Master's thesis, University of
Minnesota Library, 1931.

5. Andrew, Dorothy M. and Paterson, Donald G. "Measured
Characteristics of Clerical Workers," Bulletin of the Employment
Stabilization Research Institute, University of Minnesota,
III, 1 (1934), 60.

6. Andrew, Dorothy M. and Paterson, Donald G. "Minnesota
Vocational Test for Clerical Workers: Manual of Directions."
New York: The Psychological Corporation, 522 Fifth
Avenue, 1939.

7. Candee, Beatrice and Blum, Milton. "A New Scoring System
for the Minnesota Clerical Test," Psychological Bulletin,
XXXIV (1937), 545.

8. Dvorak, Beatrice. "Differential Occupational Ability Patterns,"
Bulletin of the Employment Stabilization Research Institute,
University of Minnesota, III, 8 (1935), 46.

9. Green, Helen J., Berman, I. R., Paterson, D. G., and Trabue,
M. R. "A Manual of Selected Occupational Tests for Use
in Public Employment Offices," Bulletin of the Employment
Stabilization Research Institute, University of Minnesota, II,
3 (1933), 31.

10. Schneidler, Gwendolen G. "Further Studies in Clerical Aptitude,"
Ph.D. thesis, University of Minnesota Library, 1940.

11. Stead, William H., Shartle, Carroll L., and Associates. Occupational
Counseling Techniques. New York: American Book
Co., 1940. 273 pages.


EXAMINING EXAMINERS


NORMAN J. POWELL
New York City Civil Service Commission

THE EXAMINATION of applicants and the establishment
of lists of persons eligible for appointment to professional
positions in any school system is a most important
task. In New York City, the examining work is performed
by a board of seven examiners selected as the result of competitive
examination given by the Municipal Civil Service
Commission. In view of the considerable current interest in
the matter of examinations for teacher and administrative
personnel, a somewhat detailed description of the procedures
used by the New York City Civil Service Commission in the
most recent test given for examiner may be of suggestive
value.

It should be noted that civil service examinations, by their
nature, are subject to peculiar and serious difficulties. Since
examinations cannot be repeated it is not ordinarily practicable
to obtain evidence as to the validity of specific test material;
the selection and use of such material must rest largely upon
judgment and indirect evidence. The passing mark is often
arbitrarily set by law, usually at the 70 or 75 per cent point,
necessitating nice judgment on the part of examiners as to
the difficulty of the test material in relation to the caliber of
the applicants; if the examiners do not judge the situation
accurately, it is then necessary to resort to the transformation
of scores.

Since the examinations are given as a public service and
the system depends upon public approval, the examinations
must give the appearance of being just and reasonable, even
to the person who knows nothing about examinations. Finally,
elements of the examination procedure are subject to appeal
and review by the courts; it must, therefore, be defensible
before judges who know nothing about examination techniques.
No model procedure has yet been developed to meet all needs
and situations. The following account presents one careful
and painstaking approach to the specific problem at hand.

A statutory requirement in the New York Education Law,
adopted in 1937, provides that applicants for
examiner positions must be college or university graduates
and possess at least five years of public school teaching experience.
To this minimum qualification, the Civil Service Commission
added the further requirement of three years of
administrative experience in the field of education. Applicants
were also required to be not more than 49 years of age at
the time of filing application. In consequence of a state law,
only residents of New York State were permitted to compete
in the examination.

A total of 114 applications was received. Of these, 88
were adjudged as meeting the education, experience, age, and
residence requirements.

The Written Test

A special committee consisting of Dean Ned H. Dearborn of New York
University, President Paul Klapper of Queens College, and Director
Paul M. Mort of the Advanced School of Education,
Teachers College, Columbia University,
was designated by the New York City Civil Service Commission
to prepare the written test.¹

The written test, weighted 4, together with an oral test,
weighted 2, and an evaluation of candidates' training and
experience, weighted 4, comprised the entire examination. In
order to be allowed to take the oral test, candidates had to
pass the written and, in addition, had to pass the oral test to
be eligible to have their training and experience evaluated.
Only 61 of the 88 qualified persons appeared for the written

¹ The writer served as aide to each of the committees who worked with the
examiner test.




test. Three applicants withdrew after part of the examination,
leaving a total of 58 candidates.
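The stated weights of 4, 2, and 4 imply a weighted average of the three component marks. The following sketch shows that arithmetic only; the dictionary keys, the function name, and the sample marks are hypothetical, and the Commission's actual computing routine is not described in the text.

```python
# Weights as stated in the account: written 4, oral 2, training and experience 4.
WEIGHTS = {"written": 4, "oral": 2, "training_experience": 4}


def composite_mark(marks):
    """Weighted average of the three component marks (each on a 0-100 scale)."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)
    return weighted_sum / total_weight


# Hypothetical candidate who cleared every hurdle.
example = composite_mark({"written": 80, "oral": 70, "training_experience": 90})
```

Because the parts were taken in sequence, only candidates who had passed both the written and the oral tests would have a composite of this kind at all.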

Divided into four equally weighted parts and with a mark
of 65 per cent in each part as well as a general written average
of 75 per cent required to pass, the written test was in
neither traditional objective nor essay form. The abilities to
be measured did not appear to lend themselves to usual objective
test treatment. The precise abilities taken for measurement
may be exemplified by reference to both the questions
used in the test and the directions given to candidates.
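The double passing requirement for the written test, a minimum in each part together with a minimum general average, can be expressed in a few lines. This is a sketch for illustration; the function and parameter names are hypothetical, and equal weighting of the four parts is taken from the description above.

```python
def passes_written(part_marks, part_minimum=65.0, average_minimum=75.0):
    """Apply the two stated conditions to the four equally weighted parts."""
    if len(part_marks) != 4:
        raise ValueError("the written test had four parts")
    average = sum(part_marks) / len(part_marks)
    every_part_ok = all(mark >= part_minimum for mark in part_marks)
    return every_part_ok and average >= average_minimum


# Hypothetical candidates.
clears_both = passes_written([80, 75, 70, 78])      # average 75.75, no part below 65
fails_one_part = passes_written([90, 90, 90, 60])   # high average, Part IV below 65
fails_average = passes_written([70, 70, 70, 70])    # every part passes, average 70
```

The second and third examples show why both conditions matter: a strong average cannot offset one weak part, and uniformly adequate parts do not guarantee the required average.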

In Part I, for which the candidate was allowed three
hours, the applicant was informed:

"In rating this paper consideration will be given to clarity in defining
the problem, cogency of facts used, orderly presentation of thought,
conciseness of expression, and the general effectiveness of the analysis and
discussion."

A single three-hour essay was to be written on one of five
problems of which the following is illustrative:

"It has been suggested that an examining board concerned with
improvement of its techniques should maintain a research division.

"Analyze this proposal discussing the functions, the organization,
the personnel, the values, and the limitations of such a division."

Another example is:

"It is maintained that in examinations for promotion there must
be full recognition of the contributions which the candidate made in the
subordinate position.

"Discuss this policy of recognition from the point of view of an
examiner in a school system."

The second part, requiring four hours, consisted of 25
technical questions. The directions were similar to those given
for the first part. An illustrative question is:

"A reliability coefficient of .55 can be said to be typical for rating
personality traits by ordinary judgment methods. Is this coefficient
high or low? What basis do you have for your answer? What difficulties
are involved in the interpretation?"

In another type of question requiring a longer response,
the candidate was directed to assume the establishment of a
new supervisory position in the Department of Education,
was given the duties of the position, the requirements for
which had been set up, and was asked to state whether he
agreed with the requirements established, whether there should
be additional requirements, and to give a critical analysis of
the statements regarding the oral test to be given.

The form of Parts III and IV was similar. In each the
applicant was told:

"This part of the examination is a test of your ability to analyze
a given problem, to document your position, and to employ sound reasoning.
The examiners will be concerned with your ability to present
evidence, not with the nature of your attitudes. You are required to
demonstrate the depth and breadth of your scholarship in answering
these questions."

Four hours were allowed for the completion of each part.
There were 40 items in each. Typical questions in Part
III are:

"5. 'Education is a phase of civilization, not the whole.' What
are the implications of this statement for education as one
of the societal agencies?

"9. 'No philosophy of education is fundamental until it is based
on sociology, not on physiology, not even on psychology, but
on sociology.' Is this a valid statement? Why?

"15. Is it possible for education to be non-partisan? Why?"

Examples of the questions in the fourth part are:

"5. 'The very fact of contact between two cultures tends to engender
features new to both.' What basis is there for this
statement?

"17. 'Inertia conditions the solid framework of society and makes
culture possible.' Is this a valid statement? Why?

"12. 'As a group, the aged are increasing faster than the general
population.' Enumerate three highly significant social
effects."

An effort was made to eliminate the deficiencies of customary
essay testing and to introduce the major advantages
of the objective test, while retaining the virtues of essay testing.
It may be pointed out that the considerable length of
the examination made possible both intensive and extensive
sampling. The type of item used in Parts III and IV has
been subjected to quantitative analysis by the Social Science
Research Council with the finding that it is an "extraordinarily
useful" instrument. Dr. Brigham is of the opinion that
the kind of examination question utilized in the third and
fourth parts of the test measures "breadth of background,"
though the directions in the test under consideration here
state that depth of scholarship is also to be demonstrated.²
Certainly Part I approaches closely the appraisal of depth
aspects and Part II probes depth to a somewhat lesser degree
than it evaluates breadth.

The central difficulty involved in essay examinations is
unreliability of rating. To promote objectivity in rating the
last two parts of the test, the extent of the response by candidates
was constricted temporally and therefore spatially. A
seven-point rating scale was employed in which the characteristics
of best, mediocre, and poor answers were recorded
to serve as guides for the awarding of credits. Definite key
answers were formulated to all the questions in Parts I and
II of the test and a scale for the allocation of credits was set
up. In all parts of the test, rating keys were constructed by
two or three examiners in conference, partially by reference
to relevant literature or other sources, partially by examining
candidates' answers to provide a realistic scoring basis.
Marking was performed independently by two or three examiners
who, after the completion of the scoring, compared their
ratings. All discrepancies except those of a trivial character
were noted and candidates' answers were reread to find an
equitable base for agreement by the raters as to the mark
merited by the specific answer. Final ratings were the mean
of the individual examiners' marks.
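The marking routine just described, independent ratings followed by a conference on non-trivial discrepancies and a final mark equal to the mean, might be sketched as follows. The account does not say how large a discrepancy counted as trivial, so the one-point tolerance below is purely an assumption, as are the function name and the sample ratings.

```python
def review_ratings(ratings, tolerance=1):
    """Summarize two or three independent ratings of a single answer.

    Returns (mean rating, whether the answer should be reread in conference).
    A spread greater than `tolerance` scale points is treated as a
    non-trivial discrepancy; the tolerance value is assumed, not sourced.
    """
    if len(ratings) not in (2, 3):
        raise ValueError("expected ratings from two or three examiners")
    spread = max(ratings) - min(ratings)
    needs_conference = spread > tolerance
    mean_rating = sum(ratings) / len(ratings)
    return mean_rating, needs_conference


# Hypothetical ratings on the seven-point scale.
close_pair = review_ratings([5, 6])        # trivial difference
split_panel = review_ratings([2, 5, 5])    # flagged for rereading
```

In the procedure described, a flagged answer would be reread and the examiners' marks possibly revised before the mean was taken as the final rating.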

For the type of examination employed in Parts III and
IV, a rating reliability of .87 for total score has been found
for a 40-question, four-hour social science test.³ In a 40-question
examination of similar form given for promotion to
Captain, Department of Correction, New York City, the correlation
for total test score between two raters was .93. Split-half
reliability adjusted by the Spearman-Brown formula was

² C. C. Brigham, Examining Fellowship Applicants (Princeton: Princeton
University Press, 1935), pp. 22-3.

³ Ibid., p. 1?.


.92, while the standard error of measurement was 3.30.⁴
There appears to be fairly substantial evidence that the kind
of examination constructed is satisfactorily reliable.
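For readers who wish to check figures of this kind, the two standard formulas involved are sketched below: the Spearman-Brown step-up from a half-test correlation to full-length reliability, and the standard error of measurement. The function names and the sample values are illustrative only; the half-test correlation and score standard deviation for the Captain examination are not given in the text.

```python
import math


def spearman_brown(r_half):
    """Full-length reliability estimated from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), in the units of the test score."""
    return sd * math.sqrt(1 - reliability)


# Illustrative values, not taken from the examination itself.
stepped_up = spearman_brown(0.5)
sem_example = standard_error_of_measurement(10.0, 0.75)
```

If the reported reliability of .92 and standard error of 3.30 refer to the same score scale, the second formula would imply a score standard deviation of roughly 11.7; this is an inference from the formula, not a figure stated in the source.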

Much of the basis upon which the widespread belief in
essay test unreliability rests seems to be a derivative of experimental
findings arising from biased investigations. The bias is
a result of rating without the use of keys so that differences
between raters are differences between judgments as to the
nature of correct responses and the magnitude of the credits
to be awarded to partially correct answers as well as divergencies
in the appraisal of particular responses. It appears
exceedingly probable that there would be differences of opinion
among experts in the rating even of many multiple-choice
questions if the experts were not provided with a scoring key.
In the present instance the formulation of key answers and
rating scales, frequent conferences among raters, and the use
of several raters tend greatly to eliminate unreliability of rating
in each part of the written test.

The essential, significant characteristic of a test is its
validity. Reliability is only of incidental importance since a
test may be reliable without being valid but cannot be valid
unless it is also reliable. Unfortunately, in the written test as
in the oral and experience measures, it is not possible to
compute a validity coefficient in terms of an acceptable
criterion of ability on the job. No satisfactory criterion
exists; only one candidate was appointed subsequent to the
examination.

A validity judgment may, however, be predicated upon two
elements: the backgrounds of the special examining committee and
the reasonableness of the appearance of the examination. Both
factors support the belief that the written test is valid. The
examining panel consisted of prominent educators highly
experienced in the selection of personnel. Also, the written test
ranged widely over many subjects apparently pertinent to
examining work, and its length was sufficiently great to make
validity an exceedingly probable attribute of the test.

Bureau of Research, New York City Civil Service Commission,
"Selection of Captains in the New York City Department of
Correction," Public Personnel Quarterly, I 1, 6-7.




The final factor of great importance is the differentiating
capacity of the test. In terms of maxima of 100 per cent, the
applicants' scores are set forth below.



                      Part I   Part II   Part III   Part IV
Mean                   62.1     53.7       47.1      50.0
Standard Deviation     15.1     11.6       12.[?]    10.0
Highest Score          94.6     74.4       68.6      77.1
Lowest Score           35.0     26.3       21.4      25.7
Range                  59.6     48.1       47.2      51.4


Test scores separate well among candidates. Applicants are
distributed over approximately 50 percentage points, about half
the total possible range. It is the middle half of the scale
which is occupied by candidates' scores. The range is roughly
from 25 to 75 per cent except for Part I, where scores are
distinctly higher. The highest mark for the written test
combining all four parts was 76.7 per cent. The passing mark was
75 per cent.

It follows that either the written test was too difficult or the
candidates were too poorly equipped. Either conclusion suggests
the desirability of transmuting original scores into higher
marks. If the test was too difficult, some of the failures should
be passing persons. If the applicants are defective, the
condition is unfortunate but largely the product of the rigid
statutory requirements limiting applicants to particular groups.
An extensive publicity campaign had insured that all or
practically all qualified persons were aware of the opportunity
to compete in the examination.

The adjustment of marks involves the necessity for determining
the nature of the transmutation process. In each test part, the
mean was taken as the point of reference and denominated 75 per
cent. Distances in standard deviation units above and below the
mean fixed the precise percentages awarded to candidates.
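
The transmutation just described can be sketched as a linear rescaling of standard scores. The article fixes the mean at 75 per cent but does not state how many percentage points were awarded per standard deviation, so `points_per_sd` below is a hypothetical parameter, and the raw scores are invented purely for illustration; they are not data from the examination.

```python
from statistics import mean, pstdev

PASSING_MARK = 75.0


def transmute(raw_scores, points_per_sd=10.0, anchor=PASSING_MARK):
    """Rescale raw scores so the group mean becomes `anchor`.

    Each standard deviation above or below the mean adds or subtracts
    `points_per_sd` points. The value of `points_per_sd` is a
    hypothetical constant; the source does not report it.
    """
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    if s == 0:
        # No spread: every candidate sits exactly at the reference point.
        return [anchor for _ in raw_scores]
    return [anchor + points_per_sd * (x - m) / s for x in raw_scores]


# Invented raw scores for one test part (illustration only).
raw = [35.0, 48.5, 55.0, 62.0, 70.5, 94.0]
scaled = transmute(raw)
passed = [x for x in scaled if x >= PASSING_MARK]
print([round(x, 1) for x in scaled], len(passed))
```

Whatever constant is chosen, a candidate passes exactly when the raw score is at or above the group mean, so with a roughly symmetric distribution about half the group is passed.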

Thus, of the 58 candidates, 29 were passed in the written test.
The purpose of the examination was to place on an eligible list
those persons who appeared to be qualified for the position of
examiner. With this guiding principle in mind, it was considered
desirable that the better half of the candidates taking the
written test be given the opportunity to submit to further
examination. It was considered that all those in the upper half
of the group had demonstrated the possession of a comparatively
acceptable minimum of scholarship. Rescaling, then, involved
taking one point of reference instead of another. It was believed
that incompetents who managed to slip by in this process would be
caught in the tests which were to follow. An oral test then was
administered to the 29 persons who had survived the written.

The Oral Test

The test was given in two parts equally weighted and separated in
time by about six weeks. The first part was designed to measure
technical competence; the second set out to appraise judgment,
clearness and quickness of comprehension, manner, appearance, and
speech.

In Part I of the technical-oral test, the 29 persons who had
passed the written examination were divided into six groups, five
with five persons and the sixth with four persons. For each group
a demonstration oral examination was enacted. The demonstration
orals were ostensibly given for a particular job. The particular
positions for which the demonstrations were held were teacher of
English in the high schools, teacher of economics in the high
schools, psychologist, research assistant, elementary school
principal, and director of adult education.

One group of candidates observed one demonstration, a second
group observed another, and so on. In each case both
demonstration examiner and subject were members of the examining
division of the Commission. Into each demonstration certain
defects and virtues were introduced both with regard to the
demonstration examiner and the demonstration subject.
Demonstrations were written and planned in advance. Candidates
were required to rate both participants in the interview which
was enacted. Each demonstration lasted for about one-half hour.
Candidates were permitted to take notes while the demonstration
was in progress and then allowed an additional fifteen minutes
for note taking.



Following the demonstration, candidates retired to an adjacent
room and were summoned individually for an oral examination
before the examining panel. This oral examination lasted for not
less than one-half hour.

Candidates were rated by the panel in accordance with a set of
directions which had been formulated in advance and in accordance
with criteria prepared prior to the demonstration and adjusted
after the demonstration to fit the performance which had been
observed. The members of the panel viewed the demonstration at
the same time as the candidates in order to be able to adjust
accurately the criteria for rating to accord with the
demonstration observed by the candidates. The ratings received by
candidates were determined by the ratings they had given to
participants in the demonstration and by the adequacy of the
support they were able to adduce for their ratings. The
candidate's evaluation of the demonstration examiner was required
to be supported by observations on the examiner's attitudes
toward the subject, his skill in questioning, and the general
conduct of the interview. The evaluation of the demonstration
subject was required to be supported by observations on speech,
manner, judgment, and appearance.

Of the 29 who had taken Part I of the oral, 16 qualified to
proceed to the second part. That an applicant was "qualified to
proceed" did not necessarily mean that he passed Part I, since a
general average of 70 per cent in the oral test as a whole was
required in order to pass. For example, candidates who obtained
60 per cent in the first part of the oral proceeded to the second
part of the oral with the possibility of passing the entire oral
test only if they obtained a score of 80 per cent on the second
part, which would give them the required average of 70 per cent.
The marks received by the 16 candidates who qualified in the
first part of the oral were:

Mark     Frequency
60.0         3
60.8         2
62.1         1
63.3         3
65.0         1
65.8         1
66.7         1
[?]          1
74.2         1
75.8         2

Only three persons obtained scores of 70 per cent or better in
the first part of the oral. It is indicated, then, that the large
majority of the candidates who appeared for the second part of
the oral had already exhibited mediocrity with regard to
technical competence.

There is a sizable discrepancy in score between Part I and Part
II in only two cases. In one case, a candidate who had obtained
75.8 in the first part received 60.8 in the second. In the other
case, a candidate who had obtained 74.2 in Part I received 59.0
in Part II. The point of these data is that candidates were
consistently poor. The fact of the matter is that most of the
candidates performed in mediocre fashion in the first part of the
oral and merely confirmed their mediocrity in the second part of
the oral, already having shown inferiority in the written test.

The 16 individuals who took Part II of the technical-oral test
were divided into eight groups of two. For each group of two
persons, examined separately in a particular half day, two types
of situations were set up. In the first type of situation, the
candidate was directed to assume that he had been appointed to
the position of examiner and that he had been serving in this
position for about five years. The candidate was told that he
would be visited by a person with whom he was to talk for about
half an hour and that he was to conduct this interview as
naturally and effectively as he could. For this examination, one
person, Professor Robert K. Speer of New York University, acted
as the visitor and assumed a different role for each group of two
candidates. The roles assumed were representative of a parent
association, assistant examiner, colleague on the board,
reporter, representative of the Civil Service Commission, visitor
from Sioux City interested in teaching personnel, failed
candidate, and representative of a teacher training institution.

A second type of situation was established after the conclusion
of the candidate's interview with the visitor. This consisted of
three or more questions. Examples of the first kind of question
are:

Can education be reconstructed through research?

How would the adoption of a particular philosophy of examining
in New York City affect educational practice and thinking
throughout the whole country?

What do you consider to be the major virtues (or defects) of our
educational system in the United States?

For the second question, the candidate was given a quotation,
asked to tell whether he agreed or disagreed with the quotation
and what the implications were of his agreement or disagreement
for the work of the Board of Examiners in New York City. Some
typical quotations are:

"The danger which comes from emphasizing the significance of
contemporary changes is that hasty and unsound revisions will be
made in the curriculum."

"In the solution to educational problems lies the solution to all
social problems."

"The main function of education is to perpetuate democracy."

"Adult education should be limited to educable adults."

For the third question, the candidate was required to talk for
several minutes on any topic which he deemed to have implications
for the work of the Board of Examiners. This was followed, where
appropriate, by having members of the panel question the
candidate directly in order to obtain clarification or
amplification of one or more points made by the candidate. It is
noted that members of the panel were free to ask questions of the
candidate at any time in order to explore a statement by the
candidate.

The direct questioning by the panel was introduced by having the
candidate talk for several minutes on his experience and
background, mainly in order to give the candidate an opportunity
to "warm up" prior to the questioning. The ratings given to
candidates were made in accordance with written directions
adopted by the examining panel. In the situation in which the
candidate assumed that he was already an examiner, the following
rating criteria were employed: soundness of position taken,
cogency of discussion, clarity of discussion, penetration of
treatment, time taken for effective organization of responses to
the visitor, manner and attitude adopted toward the visitor,
quality of speech, and appearance. In the situation involving
more direct questioning, the following criteria were used:
importance of material selected, soundness of position taken,
relevance of material selected, clarity of presentation,
penetration of treatment, quality of speech, manner and attitude
adopted toward the panel, time taken for effective organization
of responses, and appearance.

Information for the rating of the five factors was supplied both
by the direct questioning and the assumed examiner situations. In
the rating of the five factors, a rating scale was employed which
ranged from 0 to 100 per cent. The standards set were:

"Unacceptable candidates should be given ratings below 60 per
cent. Ratings between 60 per cent and 75 per cent should be given
to candidates whose characteristics considered in this part of
the test are only very slightly inferior in level to those of a
high-grade examiner. Ratings above 75 per cent should be given to
candidates who undoubtedly possess a high level of the
characteristics defined in this part of the examination."

There were three examiners in the first part of the
technical-oral: Joseph G. Cohen, Director of the Division of
Graduate Studies, Brooklyn College; Ned H. Dearborn, Dean of the
Division of General Education, New York University; Margaret V.
Kiely, Dean, Queens College.

Because the second part was less susceptible to objective rating,
the examining committee was increased to five in order to
minimize subjectivity: Ned H. Dearborn of New York University;
Willard S. Elsbree, whose special field is teacher personnel, of
Teachers College, Columbia University; Margaret V. Kiely of
Queens College; Jesse H. Newlon of Teachers College, Columbia
University; Ordway Tead, President, Board of Higher Education in
New York City.



The traditional deficiencies of oral tests are well known and
include in civil service examining the difficulty of achieving
both the fact and the appearance of satisfactory reliability and
validity. Appearances are of considerable importance in public
personnel administration. Not only must the examination be an
effective instrument but also it must avoid the impression of
being arbitrary, unfair, or capricious even though it is none of
these in fact. The difficulty is generally that of describing
adequately the basis for ratings and of connecting clearly the
rating scale with the candidate's performance in the
determination of marks. It must be proved that marks are accurate
and unbiased. This necessity was recognized and met by setting
forth in writing the nature of the scoring scales, criteria, and
standards used, and by keeping stenotype and phonographic records
of all questions and answers. The effort to have the examining
panels be rather large and representative of diverse educational
viewpoints and to have them consist of leaders in the profession
was also considered to contribute toward the objective of
coupling seeming with actual validity.

The technical requisites for reliability seem to be present. The
average intercorrelation of the examiners' ratings on the first
part was .91, making an estimated reliability for their composite
ratings of .97, by the Spearman-Brown formula. The average
intercorrelation among the examiners on the second part was .765,
making the estimated reliability for their composite ratings .95.
The average difference among examiners in Part I of the oral is
2.2; in Part II the average difference is 5.4. Scores were in
five-point units, as 50, 55, 60, 65, 70, so that the disparity in
grades is about half of one point on the Part I rating scale and
about one point on the scale in Part II.
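
The composite reliabilities quoted above follow from the Spearman-Brown formula, r_k = k r / (1 + (k - 1) r), applied to the average intercorrelation r of k examiners. The sketch below assumes the average intercorrelation was entered directly into that formula with k = 3 for Part I and k = 5 for Part II, which is the usual procedure but is not spelled out in the text.

```python
def spearman_brown(r: float, k: float) -> float:
    """Reliability of a composite of k parallel ratings (or of a test
    lengthened k times), given the single-unit correlation r."""
    return k * r / (1.0 + (k - 1.0) * r)


# Part I of the oral: three examiners, average intercorrelation .91.
part1 = spearman_brown(0.91, 3)

# Part II of the oral: five examiners, average intercorrelation .765.
part2 = spearman_brown(0.765, 5)

print(round(part1, 3), round(part2, 3))
```

For Part I the result rounds to the reported .97. For Part II the same calculation gives a value of about .94, in the neighborhood of the reported .95. The split-half adjustment mentioned for the written test is the same formula with k = 2.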

Since quantitative appraisal of the validity of the oral tests is
impossible, the problem must again be approached logically.
Validity refers to the degree to which a test measures what it
sets out to measure. The oral attempted to evaluate ability to
judge applicants for educational positions, to analyze weaknesses
and strengths in oral examining methods, to deal with visitors,
to display good judgment and comprehension, and to exhibit a
satisfactory appearance, manner, and speech. Situations were
formulated with the explicit purpose of measuring these factors,
all of which appear to be significant samples of the examining
task, so that from the viewpoint of job analysis the oral appears
to be acceptably valid.

The Combined Scores

When the scores for both parts of the oral were combined, it was
found that only one candidate had achieved a passing rating. At
this point in the examination, however, adjustment of marks to
pass a greater number was deemed undesirable. There are several
reasons for not transmuting marks in the oral test. In the oral,
the identity of candidates is known. In the written, identity is
concealed by having candidates enter their application numbers in
place of their names. To transmute marks where identity of
applicants is known is to make possible the charge of
manipulation. Further, the written was followed by other tests
able to weed out the unfit; the oral was to be followed by an
experience test in which every person admitted to the examination
was certain to receive a passing mark because all possessed the
prescribed minimum education and experience qualifications.
Moreover, very substantial opportunity had been afforded to
aspirants for the position of examiner to prove themselves.
Applicants had been examined at four separate occasions in the
written test and at two different times in the oral. Finally, the
matter of standards is highly relevant in deciding whether or not
to rescale marks.

The practice of the New York City Commission is to affix rating
numbers to all written test papers and to detach the application
number from answer sheets. The applicant knows his application
number, but not his rating number.

The position of examiner is of the utmost importance in a school
system. The examiner is responsible for the selection of
educational personnel and therefore, in a large measure, for the
quality of the teaching done and the manner in which the youth of
the city is taught and molded. The position pays $11,000 a year
and is held for life after a six-month probationary period which
is not made effective since no appointee to this position has
ever been discharged after probationary appointment. It also must
be borne in mind that educational practice in New York City
affects to a degree educational practice in the remainder of the
country. It seems reasonable to believe that under these
conditions a high standard is desirable for this position. The
position was taken by the Commission that passing only one of 58
persons taking an examination of this type is not evidence of an
unjustifiably high set of standards when the number of jobs to be
filled is very small.

A great row arose after the eligible list of a single name was
published. It would be interesting and instructive to take up the
controversy in detail, but such a discussion belongs elsewhere.
Some of the objections can be laid to a lack of understanding of
fundamental measurement principles.

The examination was reviewed three times by the courts and once
by a committee on manifest errors established by the New York
City Civil Service Commission. First of the many and varied
interpretations of the data came with the appointment of the
committee on manifest errors to hear and judge candidates'
appeals. The usual procedure of the Commission is to refer all
appeals to a board of three members of the Commission staff. In
view of the importance of this examination, however, and the
necessity of eliminating any suspicion of bias, the special panel
was constituted. Its personnel consisted of Arthur A. Ballantine,
noted lawyer and Undersecretary of the United States Treasury
under President Herbert Hoover; Charles J. Pieper, Professor of
Science Education and head of the Department of Science Education
at New York University; William F. Russell, Dean of Teachers
College, Columbia University. In a report dated May [?], 1938,
the Commission found "no manifest error in the examining
methodology or in the constitution of the examining panel,"
stated that "the pass mark was set neither too high nor too low
in relation to the level of competency required," and concluded
that there was "no manifest error in the rating of any
candidate."

Suit to invalidate the test was then brought by seven of the
failed candidates. It was held by the New York Supreme Court
"that the technical-oral test against which the principal assault
was made was meticulously prepared and impartially administered,
that every safeguard to insure fairness and equality of
competition was provided, and that the standards used in rating
the competitors were in legal contemplation objective and
reviewable."

The failed candidates had greater success with the State
Appellate Division. Five justices of the Appellate Division
concurred in finding the oral examination invalid. The justices
disagreed quite strongly in regard to the selection of the ground
upon which to rest their conclusion. One stated that it was
illegal to limit the eligible list to one name; a second was
impressed with the "comparative incompetence" of the sole passing
applicant; others interpreted the evidence to point to the
intrusion of ideological considerations in the technical-oral
test.

The final word came with the decision of the Court of Appeals,
which disagreed with the Appellate Division as to why the oral
test was illegal but agreed that the test should be given all
over again.

Following these vicissitudes, the New York City Civil Service
Commission held a new oral test in 1940. This time, three
candidates were passed. The applicant who had been the only one
to qualify in the previous test was included among the three who
were successful in the new one.





NEW CRITERIA FOR OLD 


T. R. SARBIN AND E. S. BORDIN¹
University of Minnesota


IF ALL the literature on the prediction of college grades were to
be assembled in one place, the outstanding characteristic would
be the almost universal agreement that correlation coefficients
higher than .70 are practically impossible with existing methods.
As a matter of fact, Segel has collected over a hundred such
studies only to discover that the median predictive validities of
high school scholarship, tests of general achievement or
aptitude, and tests of specific aptitudes or achievements were
.54, .44, and .37 respectively.²

In studying the factors which are responsible for these
relatively low coefficients, our attention is immediately focused
on the nature of the criterion: the honor-point ratio. Commonly
used by colleges and universities as an index of the student's
achievement, this summary figure represents attainment in many
different kinds of courses taught by various kinds of teachers
with different standards of measurement.

Two characteristics of this criterion are of importance for
predictive efficiency, namely, its unreliability and its
heterogeneity. The first characteristic, unreliability, has not
really been measured effectively, but can be estimated by logical
analysis. It is agreed that even with improved methods of
measuring attainment in college courses a semi-intuitive,
hit-or-miss judgmental factor still remains in the grading
process.³ That this would create a measure of unreliability in
the individual course grade is undeniable. As long as each
teacher has a set of standards, individually derived and
reflecting a somewhat unique set of objectives, so long will
grades retain their unreliability. When we compound the
unreliabilities of the individual course grades (which we do in
computing honor-point ratios) it is improbable that the
reliability of the final criterion will approach the reliability
of the predictors.

¹ We are indebted to Professor E. G. Williamson for stimulation
and advice in the formulation of this paper. We gratefully
acknowledge his permission to use part of the data contained in
his study Prediction of Success in the Arts College, to be
published in bulletin form by the University of Minnesota.

² David Segel, Prediction of Success in College (U. S. Office of
Education, Bulletin 1934, No. 15), p. 70. See also Daniel Harris,
"Factors Affecting College Grades: A Review of the Literature,
1930-37," Psychological Bulletin, XXXVII (1940), 125-66.

If it were possible to establish perfect reliability of course
grades in individual subjects and of the honor-point ratio, the
second characteristic of the criterion, heterogeneity, would
still remain to interfere with prediction. For students who are
taking courses in natural sciences, mathematics, social sciences,
and languages in varying combinations, the criterion represents a
complex of many factors each of which logically ought to be
sampled by the components of the predictive battery. It is
self-evident that the more complex and heterogeneous the factors
in the criterion, the more difficult becomes the task of
assembling a predictive test battery which will adequately sample
this aggregate without simultaneously introducing into the
predictive index other extraneous factors. Our task would be
solved if we could assemble a series of pure measures for each
component in the criterion. Pure tests, however, have not yet
been created. The early promise of the factor analysts that a
pure test was possible has not yet been realized.

A word of caution is in order for those who would hasten, after
having discussed the unreliability and heterogeneity of the
prevailing criterion, to do something about it. The unreliability
or reliability of a criterion is only one factor in prediction.
Of equal importance are the reliability and validity of the
predictive battery. As already indicated, techniques have not yet
been developed for creating tests which will be pure measures of
any single factor. Thurstone's utilization of factor methods in
his Primary Abilities tests has not yet passed beyond the
experimental stage. In fact, first reports have been
conflicting.⁴ While the reliabilities of the predictive tests
such as the American Council and the Ohio Psychological
Examination distribute around .90, these alone do not offer hope
for a great deal of improvement in validity. Thus, research
ingenuity must be applied to the predictor variables as well as
to the criterion.

³ This is not to imply that judgments are to be abandoned. By
learning to avoid the pitfalls and fallacies in human judgments,
teachers can improve the quality and consistency of their
ratings. Several writers have treated at some length the common
errors in making judgments. See H. E. Burtt, Principles of
Employment Psychology (Boston: Houghton-Mifflin, 1926), Chapter
II, and M. S. Viteles, Industrial Psychology (New York: W. W.
Norton Company, 1932), Chapters IX, X.

One final limiting aspect of prediction must be taken into
account by the research worker before pitching his aspirations
too high. This is the indeterminacy principle that Heisenberg has
formulated for prediction in the physical sciences. Present-day
thinkers recognize that spontaneous and uncontrolled factors are
always present. These cannot be foreseen; they will introduce a
measure of error in any forecast. Among such factors to be found
in the prediction of academic achievement are momentary
motivations such as health conditions, social distractions,
sexual distractions, home conflicts, temporary moods, sets,
fatigue, and so on. Because of these not readily controllable
elements, it would be safe to guess that even with perfectly
reliable criteria and with statistically infallible predictive
tests, the upper limit of multiple correlation would still not
exceed .95. But such a pessimistic outlook need not be
discouraging to further research. Much room remains for
improvement. The increase in predictive efficiency of a
correlation of .70 to one of .95 represents a range of about 41
per cent improvement over non-test estimates.
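
The percentages quoted for forecasting efficiency appear to correspond to the index of forecasting efficiency, E = 1 - sqrt(1 - r^2), which expresses how much the standard error of estimate is reduced relative to predicting the criterion mean for everyone. The text does not name the formula, so treating it as the one used is an assumption; the sketch below simply evaluates it at the two correlations mentioned.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r**2).

    Proportional reduction in the standard error of estimate compared
    with a forecast that ignores the predictor altogether.
    """
    return 1.0 - math.sqrt(1.0 - r * r)


e_70 = forecasting_efficiency(0.70)  # typical ceiling cited in the text
e_95 = forecasting_efficiency(0.95)  # conjectured upper limit

print(f"r = .70: {100 * e_70:.1f} per cent better than chance")
print(f"r = .95: {100 * e_95:.1f} per cent better than chance")
print(f"difference: {100 * (e_95 - e_70):.1f} percentage points")
```

On this reading a correlation of .70 gives an efficiency between 28 and 29 per cent and a correlation of .95 gives one just under 69 per cent, a spread of roughly 40 percentage points, which is in line with the "about 41 per cent" stated above.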

⁴ J. M. Stalnaker, "Primary Mental Abilities," School and
Society, L (1939), 868-72. See also R. G. Bernreuter, "Primary
Ability Tests Applied to Engineering Freshmen," Psychological
Bulletin, XXXVI (1939), 548-49, and William M. Shanner and G.
Frederic Kuder, "A Comparative Study of Freshman Week Tests Given
at the University of Chicago," Educational and Psychological
Measurement, I (1941), 85-92.

The crux of the problem of selection and admission of students
hinges upon accurate prediction instruments. Prediction serves
the purpose of assisting college authorities to select students
who have a reasonable chance of profiting from the college's
offerings. Since the efforts of the test-makers, educational
psychologists, and other research workers reached a ceiling at
forecasting efficiency of approximately 28 per cent better than
non-test prediction, further research may take any of three
courses:

1. further improvement in the reliability and validity of the
   predictive battery,

2. improvement in the reliability of the criterion measures,

3. design of a new criterion which will be more predictable and
   at the same time acceptable to school administrators.

At the present time, the first approach appears to be the one
least likely to bring success, yet it is the one most frequently
selected. With the development of the method of factor analysis
hopes were raised for a significant increase in the efficiency of
tests. The belief prevailed that with the isolation of factors in
a test battery the foundation might be laid for the construction
of purer tests which in turn would lead to more accurate
prediction. Thus far this promise has remained unfulfilled.⁵
Until now the more significant contribution in test construction
has come from the method of inbreeding of test items as utilized
by Toops in the construction of the Ohio Psychological
Examination.⁶ By means of this continuous process of selection of
the most valid and most stable items, the predictive validity of
the Ohio test has at times surpassed .60. The promise for further
developments from this source is at present greater than from the
method of factor analysis. The stimulus for further advances by
the method of inbreeding probably will come from studying the
contributions of the alternatives in a multiple choice item.⁷ But
even with this contribution the prospects for the near future are
not very bright for a large increase in reliability and validity
of tests.

⁵ Stalnaker, op. cit. See also Bernreuter, op. cit.

⁶ H. A. Toops, "The Evolution of the Ohio State University
Psychological Test," Ohio College Association Bulletin No. 113,
March 20, 1939, pp. 2267-311.

⁷ G. F. Kuder, The Construction of Valid Test Items (Unpublished
Dissertation, Ohio State University, June, 1937).

The second course, increasing the reliability of the criterion
measures, sporadically has been the topic of intense discussion
in educational circles. As far back as 1913 Starch appealed for
more stable grading standards. The major factors which he cited
as the cause for instability of marks still are applicable today:

"(1) Differences among standards of different schools, (2)
differences among standards of different teachers, (3)
differences in the relative values placed by different teachers
upon various elements in a paper, and (4) differences due to pure
inability to distinguish between closely allied degrees of
merit."⁸

In the last decade a new type of emphasis in the grading
process has arisen largely through the influence of Tyler⁹ and
the Progressive Education Association evaluation work. The
efforts of this group have been directed mainly toward the
clarification of teachers' aims and objectives and the operational
definition of these aims and objectives in terms of
observable behavior. These developments have been directed
chiefly at the secondary school level in connection with the
Eight-Year Study.

The wider application of these principles at the college
level may offer some hope for the improvement of prediction.
It is assumed that this type of study will lead to a more
conscious and a more stable evaluative process which in turn
should serve to make grades more reliable. Some believe that
greater homogeneity in the objectives of grades also would
result from these developments. That is to say, many objectives
probably could be identical for different courses. If a
core of common objectives could be isolated and evaluated
in the same manner in a whole series of courses, then the
difficulty of constructing a more efficient test battery would
be reduced considerably.

⁸Daniel Starch, “Reliability and Distribution of Grades,” Science, XXXVIII
(1913), 630.

⁹Ralph W. Tyler, “Needed Research in the Field of Tests and Examinations,”
Educational Research Bulletin, XV (1936), 151-58.



The third approach is most likely, we believe, to bring
about significant increases in the predictive efficiency of test
batteries and is one that probably would encounter the most
opposition from administrators and faculty. If we assume that
the present grade criterion of college success lacks adequate
predictability and that this deficiency warrants the substitution
of a more predictable criterion, then is it not logical to
seek such a criterion? The answer can be only in the affirmative.
At least a portion of our efforts must be directed at the
possibilities of developing a more predictable measure of
academic achievement which at the same time will satisfy
other needs of the educational program.

But we also must consider the difficulties of such an
undertaking and be prepared to surmount them. Over and above
the educational and statistical problems are the sociological
problems which arise from the nature of our educational
society. This society has developed a rigid and inflexible
attitude toward marks which is likely to resist any but very
strong pressures.

To dislodge the tradition of marks, two forces must be
overcome: first, the faculty, who feel that they have a vested
interest in assigning grades, and second, parents, who, “indifferent
at times to most phases of education, seldom neglect
the report card.”¹⁰ This rigid adherence to marks has another
deleterious effect upon the educational process. Instead of
directing their efforts toward mastery of content, many students
prepare for grades. Originally designed to serve merely
as a record that a student had taken a particular course and
had acquired a certain degree of proficiency, grades too often
have become the only goal for many students. Foerster has
described the situation in pungent terms:

“Once a credit was earned, it was as safe as anything in the world.
It would be deposited and indelibly recorded in the registrar's savings
bank, while the substance of the course would be, if one wishes, happily
forgotten. Each course culminated in a final examination; if one knew
one's stuff then, one need never know it again. In a subject like required

¹⁰R. O. Billett, Provisions for Individual Differences, Marking and Promotion
(U. S. Office of Education, Bulletin 1932, No. 17), p. 459.



English a student deficient in ability might, with effort, get a passing
grade, and then, without effort, pass into semi-illiteracy; yet the record
would show, to the day of doom, that he could read and write
passably.”¹¹

All this means that institutions of higher learning may
have to abandon or modify the traditional marking system and
“produce a new convention better than the old.”¹² It is thus
seen that continued enslavement to traditional marking systems
not only interferes with the construction of more effective
selection instruments, but also vitiates some of the fundamental
objectives of higher education. Therefore, upon the
shoulders of the educational administrator falls the responsibility
of re-examining the purposes of a marking system, a
system that he either implicitly or explicitly has set up as
proper. In this re-examination he will be obliged to leave the
way open for the substitution of another marking system
which will provide optimal satisfaction of these purposes.

Our problem has come into sharp focus: a new standard
for gauging achievement in college must be sought. This
standard must palpably be superior to teachers' marks and
must rest on certain logical and statistical pillars. Our previous
discussion of the limitations upon predictive accuracy
for college selection purposes has already indicated some
of the desirable features: first, the measure should have
reliability; second, it should be as homogeneous as possible
both with respect to scale and to the nature of the factors
included; third, it must have relevancy for the educational
objectives to be measured.

Since the final test of the predictability of a criterion will
be empirical, we turn to the data that we have to present.
At this juncture the nature of the statistical evidence we have
obtained forces us to particularize in terms of the liberal arts
college, and more specifically, the junior division. The underlying
principles, however, can readily be adapted to other
college units. Before proceeding with the analysis of the
comparative predictability of the two types of criteria, a word
must be said about the objective of the junior division of the
liberal arts college. At the risk of seeming impertinently
presumptuous in stating this objective in a word, the authors
suggest that the primary purpose of the first two pre-specialization
years in the arts college is to provide students with
opportunities for cultural growth. Generally speaking, today's
liberal arts colleges by and large direct their efforts, sometimes
futilely, toward a cultural goal. Although in this context the
word culture is to be looked upon with the gravest suspicion,
the trend in today's core curricula seems to be away from the
hot-house variety of culture for the elite and in the direction
of the by-products of the best in science and society for all.
In most cases, the American liberal arts college is bent upon
providing students with a broad understanding of culture in
all of its ramifications.

¹¹Norman Foerster, The American State University (Chapel Hill, N. C.:
University of North Carolina Press, 1937), p. 97.

¹²Ibid., p. 146.

To be cultured, a man must be “more than an ape-like
creature posing under the mask of hastily acquired drawing
room manners.”¹³ The student must acquire during his
pre-specialization years the individual qualities and competences
which go into rich and satisfying living, and which give meaning
to his experiences as a member of society. This does not
imply that a cultural pattern rigidly common to all is the goal
of liberal education. To dragoon widely different students into
a legion of regimented automatons, each responding in the
same way to the same situations, is obviously to be deplored in
democratic institutions. As Eckert has phrased it: “Not
like-minded, but ‘free’ individuals become the goal of teaching.”¹⁴
Idiosyncratic behavior remains as an outstanding desideratum
of liberal education.

At this point one of two approaches is immediately apparent
for evaluating these cultural objectives. We may retreat
to the traditional methods of evaluation: teachers' grades
based on some esoteric combination of improvised testing,

¹³C. J. Warden, The Emergence of Human Culture (New York: Macmillan
Company, 1936), p. 8.

¹⁴Ruth E. Eckert, “Who Are the Cultured in Our Colleges?” Educational
Record, January, 1930, pp. 133-35.




dazzling intuitions, and the persistence of the student in
attending classes; or we may turn to procedures such as those
embodied in certain uniform testing programs.

Such a measure of culture could be assembled with the
Cooperative General Culture test as a nucleus.¹⁵ Since 1932,
the content of the Sophomore Culture test has been considerably
expanded, and today the student runs a gamut of tests
from mathematics to aesthetic appreciation before entering his
junior year. Admittedly, the paper-and-pencil instrument does
not sample the whole range of culture; neither does it directly
tap the important areas of motivation, attitudes, and values.
It is the only method yet devised, however, which has the
fundamental characteristics without which scientific measurement
in education becomes a farce, a tragedy, or both. If college
administrators and faculties are to decide whether the Sophomore
Culture test will satisfy their needs, they must weigh
it upon scales which carry empirical as well as logical weights.
If it is agreed that the predictability of a proposed criterion
is one characteristic pertinent to its adoption, the predictability
of the Sophomore Culture test becomes a matter of moment.

One might point out that it has a high reliability (coefficients
in the .90's are reported in the literature) or that it is
constructed so as to give comparable scores, but the final proof
of its superior predictability must rest upon obtained correlations
with predictive tests. The remainder of this paper is
devoted to an exploratory investigation of the predictability
of the two criteria we have been discussing: teachers' marks¹⁶
and the Sophomore Culture battery.
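
The honor-point ratio defined in the footnote on teachers' marks is a credit-weighted average of grade points. The following is a minimal sketch of that arithmetic, not the authors' own procedure. The function name and the sample transcript are hypothetical, and the treatment of failed hours in the denominator is an assumption, since the footnote speaks only of "total credit hours earned."

```python
# Honor points per credit hour, as given in the footnote on teachers' marks.
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": -1}


def honor_point_ratio(transcript):
    """Return total honor points divided by total credit hours.

    `transcript` is a list of (letter_grade, credit_hours) pairs.
    Assumption: hours of F are counted in the denominator; the
    footnote does not settle this point.
    """
    total_hours = sum(hours for _, hours in transcript)
    if total_hours == 0:
        raise ValueError("transcript has no credit hours")
    total_points = sum(HONOR_POINTS[grade] * hours for grade, hours in transcript)
    return total_points / total_hours


# Hypothetical two-year record, for illustration only (not data from the study).
example = [("A", 5), ("B", 10), ("C", 10), ("F", 5)]
print(round(honor_point_ratio(example), 2))
```

Under this scheme a straight C record yields a ratio of 1.0, and each hour of F pulls the ratio down by more than an hour of D does.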

The usual technique was employed in assembling a battery 

¹⁵The Cooperative General Culture test may be procured from the Cooperative
Test Service, 15 Amsterdam Avenue, New York. The other Cooperative
tests used in this study may be obtained from the same source.

¹⁶Teachers' marks were transmuted to two-year honor-point ratios as
follows: for each credit hour in which an A was recorded, three honor points
were assigned; for each credit hour of B, two honor points; for each credit hour
of C, one honor point; for each credit hour of D, no honor points; and for each
credit hour of F (failing) one honor point was subtracted. The honor-point ratio
was computed by dividing the total number of honor points earned by the
total credit hours earned. The Sophomore Culture test was made up of the
following tests in the Cooperative series for 1936: General Culture, English,
General Science, Literary Acquaintance.


of tests for the selection or rejection of applicants. These
tests were correlated first with teachers' marks and then with
scores on the Sophomore Culture tests. The battery of
entrance tests which has the highest correlation with the
criterion could thus be used to select future candidates for
admission. At the time of the students' entrance into the arts
college, scores on the following measures were obtained:

High school percentile rank
Minnesota College Aptitude test (form AM)
Minnesota College Aptitude test (form 1926)
Cooperative English test (Part I, form 1934)
Cooperative Vocabulary test (Part II of English test above)
Cooperative Contemporary Affairs test (form 1934)

The group included in this study was composed of students
who entered the arts college of the University of Minnesota
as freshmen in the fall of 1934, and who took the Sophomore
Culture test in the spring of 1936 in applying for admission
into the upper division. The group was composed of 138
students, 56 men and 82 women. Only students were included
for whom the 1934 entrance test scores were available and for
whom high school percentiles were recorded. The group
studied, though not closely representative of entering freshmen,
probably was representative of sophomores applying for
entrance to the senior division of the arts college. Any
limitation in representativeness, however, invalidates no
comparisons between different measures within this group.

TABLE 1

CORRELATIONS BETWEEN TWO-YEAR HONOR-POINT RATIOS AND INDIVIDUAL MEASURES
IN THE FRESHMAN TESTING BATTERY

                                          Total    Men    Women
High school percentile rank                .52     .57     .55
Contemporary Affairs test                  .50     .53     .44
Minnesota College Aptitude test (1926)     .50     .56     .43
Cooperative English test                   .41     .50     .41
Minnesota College Aptitude test (AM)       .49     .49     .39
Cooperative Vocabulary test                .35     .40     .30


Table 1 reveals the usual order of correlations between
teachers' marks and predictive tests. The best single predictor
of grades is the high school percentile rank, demonstrating that,
to a certain extent, high school teachers and college teachers
are influenced by the same factors in assigning grades. The
highest correlation in the table is .57, between grades for men
and high school percentile ranks. The lowest coefficient, .30,
is between college grades for women and the Vocabulary test.
The other coefficients fall between these two values.

TABLE 2

CORRELATIONS BETWEEN SOPHOMORE CULTURE TEST AND INDIVIDUAL MEASURES IN
THE FRESHMAN TESTING BATTERY

                                          Total    Men    Women
Contemporary Affairs test                  .81     .81     .82
Minnesota College Aptitude test (1926)     .77     .77     .76
Cooperative Vocabulary test                .68     .63     .72
Minnesota College Aptitude test (AM)       .67     .68     .66
Cooperative English test                   .58     .62     .66
High school percentile rank                .29     .43     .21


Contrast these correlation coefficients with those in Table
2. With the single exception of the high school percentile
rank, correlations between the Sophomore Culture test and
the various measures range from .82 to .58. The Contemporary
Affairs test has high predictive value, as have the
College Aptitude tests and the Vocabulary test. The nature of
the distribution of the English test scores accounts for the
lower coefficient for the total group than for either the men
or the women. It is especially noteworthy that high school
percentile ranks have little predictive value for such a criterion.
In terms of forecasting efficiencies for the total group, the
highest coefficient in Table 1 corresponds to 15 per cent, while
the highest in Table 2 corresponds to 41 per cent.¹⁷
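
The two percentages just quoted can be checked against the forecasting-efficiency formula cited from Guilford in the footnote, E = 100(1 - sqrt(1 - r^2)). The sketch below applies it to the highest total-group coefficients in Tables 1 and 2 (.52 and .81). The function name is an illustrative choice, not the authors'.

```python
import math


def forecasting_efficiency(r):
    """Per cent improvement over non-test prediction: 100 * (1 - sqrt(1 - r^2))."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))


# Highest total-group coefficients reported for the two criteria.
for label, r in [("honor-point ratio", 0.52), ("Sophomore Culture test", 0.81)]:
    print(f"{label}: r = {r:.2f}, E = {forecasting_efficiency(r):.1f} per cent")
```

Rounded to whole numbers the two results are 15 and 41 per cent, in agreement with the text. Because the index grows slowly at low values of r, a rise from .52 to .81 nearly triples the forecasting efficiency.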

The same trend appears when the multiple correlation
coefficients of selected batteries are compared. Table 3 reveals
the order of correlation between the two criteria and two sets
of entrance tests. Battery A, composed of three measures
(high school percentile rank, Minnesota College Aptitude test

¹⁷Forecasting efficiency computed by formula $E = 100\,(1 - \sqrt{1 - r^{2}})$,
which gives a measure of the per cent of improvement over non-test prediction.
See J. P. Guilford, Psychometric Methods (New York: McGraw-Hill, 1936),
p. 363.



form 1926, and English test), correlates .64 with grades
but .77 with the Sophomore Culture test. The corresponding
indexes of forecasting efficiencies are 23 per cent and 36 per cent.
When the Contemporary Affairs test is added to the three
other measures (Battery B), the correlation with honor-point
ratio becomes .67, and with the Sophomore Culture test .86.
The corresponding forecasting efficiencies are 26 per cent and
50 per cent better than non-test prediction.

TABLE 3

MULTIPLE CORRELATION COEFFICIENTS BETWEEN BATTERIES OF SELECTED ENTRANCE
TESTS AND TWO-YEAR HONOR-POINT RATIO, AND SOPHOMORE CULTURE TEST

                           Two-Year                  Sophomore
                       Honor-Point Ratio            Culture Test
                      Total   Men   Women       Total   Men   Women
Entrance Battery A*    .64    .67    .64         .77    .78    .77
Entrance Battery B†    .67    .68    .66         .86    .86    .87

*Entrance Battery A: high school percentile rank, Minnesota College Aptitude
test (1926), and Cooperative English test.

†Entrance Battery B: high school percentile rank, Minnesota College Aptitude
test (1926), Cooperative English test, and Contemporary Affairs test.

The differences between these multiple correlations for the
two kinds of criteria were tested for significance. The differences
for Battery A were in the area of doubtful validity
(P < .05 but > .02); those for Battery B were well beyond
the boundary for significance (P < .01).¹⁸

The significant results of this exploratory study in the
prediction of college success may be summarized as follows: (a)
The substitution of the Sophomore Culture test for the conventional
grading system as a criterion of college achievement
markedly increases the predictive validities of the standardized
entrance tests and markedly decreases the predictive validity
of high school grades. (b) The lowest validity coefficients
were obtained when high school percentile ranks were correlated
with the Sophomore Culture battery. The highest zero-order
coefficients were obtained in correlating the Sophomore
Culture battery with the Contemporary Affairs test. Eckert

¹⁸R. A. Fisher, Statistical Methods for Research Workers (7th ed.; London:
Oliver and Boyd, 1939), p. 209.




reports similar findings. She concludes: “Students most conversant
with the achievements and thoughts of the past, and
most outstanding in the realm of book-learning, tend on the
whole to be those most alert to the contemporary scene.”¹⁹
(c) A combination of four entrance measures returned validity
coefficients with the Sophomore Culture test corresponding to
50 per cent forecasting efficiency. The Contemporary Affairs
test alone correlated higher with the Culture test than did a
combination of three measures.

Interpreting these results, the Sophomore Culture test
correlates high with the other objective tests because of its
close similarity in objectivity of form, its greater relevancy
and comprehensiveness, and the overlapping of the content
and ability measures, and correlates low with high-school
grades because the latter appraise other areas besides those
involved in the test sampling of achievement. Grades in
college, conversely, correlate higher with grades in high
school, and lower with the standardized tests because they
measure areas outside of tested achievement but similar to
those measured by high-school grades. This interpretation can
be further supported and extended since the correlation between
the Sophomore Culture test and the two-year honor-point
ratio was only moderately high: .58 for men and women
combined, .64 for men, and .51 for women. Scholastic grades
and the Culture test, even when they presumably sample the
same areas of knowledge, certainly do not measure all of the
same areas or abilities involved in academic achievement in
college.

These results are not without precedent. For example,
Frasier and Heilman reported correlations between the Thorndike
Intelligence Examination and grades assigned subjectively
and objectively. The average coefficients were .45 and .60
respectively.²⁰ For grades in French as assigned in the usual
manner, Tharp found a correlation of .47 with the Iowa Placement
test for foreign language aptitude. When an achievement
test was used, the correlation jumped to .64.²¹

¹⁹Eckert, op. cit., p. 13J.

²⁰G. W. Frasier and J. D. Heilman, “Experiment in Teacher College
Administration, III. Intelligence Tests,” Educational Administration and
Supervision, XIV (1928), 268-78.

From our examination of the problem of prediction, we
draw the conclusion that a fruitful point of attack is through
the substitution of a more reliable and therefore more predictable
measure of achievement. This paper has presented data
which definitely demonstrates that a pencil-and-paper evaluation
instrument such as the Sophomore Culture test is more
predictable than the time-honored grade criterion. But it
would be foolhardy indeed for the authors to take the next
step, that of advocating that this attribute alone justified its
substitution for honor-point ratio. This decision lies within
the province of the educational administrator. He must decide
whether more accurate prediction, a sine qua non of all efficient
admissions policies, plus the Culture test's degree of
relevance is sufficient to outweigh those desirable qualities
which may still be claimed for the traditional marking system.
In short, he must decide whether this new criterion is more
acceptable than the old.

A final word for research. The Sophomore Culture test,
in common with other achievement tests, largely measures
recall of information.²² That information is only one phase
of education must be recognized. Other components of cultural
growth (attitudes, values, motivations, goals, and affective
experience) must be measured by other instruments. It
is hoped that in the not-too-distant future these important
outcomes of education can be appraised with sufficient accuracy
so that we may know how well the American college functions
as the vehicle of culture.

²¹J. B. Tharp, “Sectioning Classes in Romance Languages,” Modern
Language Journal, XII (1937), 95-114.

²²E. E. Cureton, “Evaluation or Guidance: A Report of the 1939 Sophomore
Testing Program,” Journal of Experimental Education, VIII (1940),
308-40.





A FACTOR ANALYSIS OF A NON-VERBAL 
REASONING TEST 

ROBERT I. BLAKEY
Social Security Board

Some time ago Dr. Andrew W. Brown and the author
constructed a “Non-Verbal Reasoning Test” for use at
the high school level. A preliminary report of its construction
is being published by The Journal of Educational Psychology.
The present article concerns itself with the results of a factor
analysis of the intercorrelations between the subtests rather
than with the actual standardization of the test.

The test was constructed with the idea that it should
measure in a non-verbal manner the higher intellective processes
of comprehension, mental alertness, deductive reasoning,
inductive reasoning, and spatial relations or analysis. The primary
purpose of this study is to isolate and identify any
common factors present and to compare them with the expected
factors.

Other problems which may be considered in the light of
the factor analysis are (a) a comparison of the factorial
composition of tests which are variations of Thurstone's tests
with the factorial composition, as determined by Thurstone,
of the tests he used; (b) a reconsideration of the perennial
problem of the existence of a general factor of mental ability;
(c) the comparison of the factors found in this group of
tests with factors found in analyses of other tests; (d) a further
examination of various methods of ascertaining the number
of factors which should be taken out of a correlation
matrix.




All tests are time-limit tests and were introduced by
fore-exercises which were explained by the examiner. They were
presented in the order listed.

1. Manikin: a page of pied figures of little men. The
figures are simple line drawings with variations in the positions
of arms and legs. The problem is to draw a ring around each
manikin which is exactly like a model at the top of the page.
It was thought that this test might be saturated with the
Perceptual Speed factor. The Spearman-Brown corrected
reliability is .81.

2. Identical Patterns: 12 rows of patterns formed by
overlapping geometrical forms. The first pattern of each
row is separated from the others by a heavy vertical line.
The patterns are in 12 variations each composed of two circles
and two right triangles. The same size forms are used in
each variation, the differences being due to relative positions
of the components and whether the forms are solid or dotted
lines. Each row contains one or more patterns exactly like
the first one in the row, and the problem is to place a mark
under each pattern which is exactly like the first one in its
respective row. It was thought that this test would be a variation
of Thurstone's Identical Forms test and consequently
loaded with the Perceptual Speed factor. The Spearman-Brown
corrected reliability is .98.

3. Fitting Parts: each item consists of a solid black
geometrical form, which has been cut into three parts, and
four outlined figures, one of which is the same size and shape
as the black figure which was cut. The problem is to indicate
that one of the outline forms into which the solid black
pieces could be made to fit exactly. Discrimination of both
size and shape is involved for each item. It was thought that
possibly the factor Visualization or Space was involved in the
solution of this test. The Spearman-Brown corrected reliability
of the 12-item test is .47.

4. Opposite Sides: each item consists of three drawings
identical in size and shape. The problem is to select the
drawing in each item which is a mirror image of the other
two drawings. Each drawing is a little pennant the shape
of a non-isosceles right triangle and may be rotated in any
position. It was thought that possibly Space and Induction
might be used in the solution of this test. There is no really
parallel form to this test although the idea was adopted from
Thurstone's Flags test. The Spearman-Brown corrected
reliability is .88.

5. Code: a code consisting of eight boxes divided in
half is placed at the top of the test. Each box has a unique
group of squares and circles in the top half and an unusual
group of triangles in the bottom half. Below the “code” are
five rows of the little boxes, some exactly like the boxes in
the code and some with incorrect pairing of the symbols.
The problem is to place a line under each box which is different
from the code. It was thought that the test might contain the
Perceptual factor. The Spearman-Brown corrected reliability
is .96.

6. Circle Grouping: each item consists of four boxes
containing little groups of circles. The grouping varies from
box to box. One circle in each of the first three boxes is
blackened according to a system. The problem is to discover
that system and apply it in blackening a circle in the fourth
box. It was thought that possibly Induction would be involved
in solving this test. The Spearman-Brown corrected reliability
for the 12-item test is .98.

7. Form Series: this test is the usual series type with
only three meaningless forms used in combination. One figure
in each row is omitted and a blank inserted. The problem is
to indicate which form belongs in the blank. It was thought
that Deductive Reasoning or Inductive Reasoning would be
involved in the solution of this test. The corrected
Spearman-Brown reliability of the 22-item test is .86.

8. Circle Reasoning: a variation of the Marks test used
by Thurstone as a measure of Inductive Reasoning. There
are five rows of groups of circles and dashes. The grouping
changes from row to row. One circle in each of the first
four rows is blackened according to a rule. The problem is
to find the rule and apply it in blackening a circle in the fifth
row. It was assumed that this test would contain Induction.
The corrected reliability is .94.

9. Form Relations: this test is a parallel form of
Thurstone's Pattern Analogies test. The problem is to indicate
one of five choices which bears the same relation to the third
figure as the second bears to the first. Inductive Reasoning
or Deductive Reasoning was assumed to be necessary for the
solution of this test. The corrected reliability is .97.

10. Form Reasoning: at the top of the test is a table
showing how any two of seven forms could be combined to
equal another one of the seven. Each item consists of three
of the forms in a row. The task is to combine the first two
forms according to the table and then combine the resulting
form with the third to equal another form, the final result
to be indicated by underlining one of five choices. It was
thought that possibly Deductive Reasoning would be used
to solve these problems. The Spearman-Brown corrected
reliability for the 12-item test is .98.
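
The "Spearman-Brown corrected" reliabilities quoted in the test descriptions are, on the usual reading, correlations between two halves of a test stepped up to the length of the whole test. The article does not say how the halves were formed, so the following is only a sketch of the standard formula under that assumption. The function names are illustrative.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Reliability of a test `length_factor` times as long as the part test."""
    denominator = 1.0 + (length_factor - 1.0) * r_part
    if denominator <= 0:
        raise ValueError("part reliability too negative for this length factor")
    return length_factor * r_part / denominator


def implied_half_correlation(r_full):
    """Invert the two-halves case: half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


# Corrected reliabilities quoted in the test descriptions.
for name, r_full in [("Manikin", 0.81), ("Fitting Parts", 0.47),
                     ("Identical Patterns", 0.98)]:
    r_half = implied_half_correlation(r_full)
    print(f"{name}: corrected {r_full:.2f} implies a half-test r near {r_half:.2f}")
```

Read backwards in this way, the low figure for Fitting Parts (.47) corresponds to a correlation between halves of only about .3, which helps explain why that test has small correlations with the others.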

The Subjects 

The subjects were 286 high school pupils from a school
in a suburb of Chicago. All tests were given by two experienced
examiners in a well-lighted room. All tests were administered
in one 40-minute period. Eighty per cent of the whole
group was between 15 and 18 years of age. The mean Otis
IQ was 114. About 54 per cent of the group were boys.
No sex difference was found for combined scores on the whole
test. No grade difference was statistically significant. The
correlation of total test score with chronological age was -.11
for the age range of this group.

The Factor Analysis

The table of intercorrelations (Table 1) was computed
with the aid of Computing Diagrams for the Tetrachoric
Correlation Coefficient (2). Correlations obtained in this
manner are considered by Thurstone (6, p. 58) to be applicable
to factor analysis. In effect the scores are normalized in
the process of correlation.

The factors (Table 2) were extracted by the Thurstone
centroid method. Here the problem of the number of factors


TABLE 1

INTERCORRELATIONS OF TESTS

Variable                1     2     3     4     5     6     7     8     9    10
Manikin                      .24   .27   .24   .38   .19   .13   .19   .22   .19
Identical Patterns     .24         .08   .17   .22   .46   .16   .15   .33   .24
Fitting Parts          .27   .08         .17   .22   .20   .13   .10   .20   .22
Opposite Sides         .24   .17   .17         .15   .25   .38   .32   .39   .31
Code                   .38   .22   .22   .15         .26   .22   .25   .35   .38
Circle Grouping        .19   .46   .20   .25   .26         .48   .50   .53   .49
Form Series            .13   .16   .13   .38   .22   .48         .35   .52   .54
Circle Reasoning       .19   .15   .10   .32   .25   .50   .35         .55   .38
Form Relations         .22   .33   .20   .39   .35   .53   .52   .55         .40
Form Reasoning         .19   .24   .22   .31   .38   .49   .54   .38   .40





TABLE 2

CENTROID MATRIX (F)

                      Code                  Factors
Variable              No.      I      II     III      IV       V
Manikin                1     .438   -.435   -.183   -.083   -.069
Identical Patterns     2     .452   -.141    .274    .263   -.200
Fitting Parts          3     .335   -.212   -.101   -.140    .087
Opposite Sides         4     .499    .100   -.163   -.112   -.242
Code                   5     .506   -.297   -.055   -.079    .130
Circle Grouping        6     .701    .138    .296    .272    .109
Form Series            7     .622    .377    .110   -.274   -.117
Circle Reasoning       8     .602    .281   -.252    .238    .205
Form Relations         9     .728    .181   -.154    .177   -.093
Form Reasoning        10     .665    .119    .166   -.251    .239


appeared. Two methods of determining the number of factors
had been tried by the author (1) previously with some degree
of success. One of these, Tucker's empirical criterion, gave
negative results in the present case. The other, Coombs'
criterion (3), postulates that in a 10-variable problem, the
last factor of value will leave a table of residuals which, when
signs are changed, will contain more than 31 negative entries
with a standard error of five. Table 3 shows the application
of Coombs' criterion to this analysis.



TABLE 3

COOMBS' CRITERION

Factor     Negatives
  1           24
  2           33
  3           24
  4           28
  5           35


It was obvious from the number of relatively large
residuals remaining in the table after the second factor was
extracted that there were more than two factors in the table.
This was borne out in the subsequent analysis, which was
carried to five factors. The indication that the fifth factor
was the last one of value seems to have been verified in the
analysis. The standard deviation of the fifth-factor residuals
before sign change is .028, which is considerably smaller than
the standard error of a zero correlation for a population
of 286.
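
As a rough check on this comparison, the standard error of a zero correlation can be approximated by 1/sqrt(N - 1). That is the usual large-sample approximation for a product-moment coefficient and is an assumption here: the text does not say which formula the author used, and the standard error of a tetrachoric coefficient would be larger. The function name is illustrative only.

```python
import math


def se_zero_correlation(n: int) -> float:
    """Approximate standard error of r when the true correlation is zero.

    Uses the large-sample product-moment approximation 1 / sqrt(n - 1),
    which is an assumption, not a formula quoted from the article.
    """
    return 1.0 / math.sqrt(n - 1)


n_subjects = 286      # sample size reported in the text
residual_sd = 0.028   # SD of the fifth-factor residuals reported in the text

se = se_zero_correlation(n_subjects)
print(f"SE of a zero correlation for N = {n_subjects}: {se:.3f}")
print(f"residual SD as a fraction of that SE: {residual_sd / se:.2f}")
```

Under this approximation the standard error is close to .059, so the residual standard deviation of .028 is a little under half of it.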

For the rotation of factors in order to secure bounding
hyperplanes, Thurstone's method of lengthened vectors was
used (4). The criteria of maximizing the number of zeros
and rotating to a postulated positive manifold were the
determiners for direction of rotation. Seven rotations were
necessary, and a "clean-up" rotation with actual-length vectors
was made. The rotated factorial matrix is given in Table 4.
The rotational matrix of direction cosines is given in Table 5.
The intercorrelations between the rotated factors are
presented in Table 6.


TABLE 4

ROTATED FACTORIAL MATRIX (FΛ)

                                          Factor
Variable             Code No.      A       B       C       D       E
Manikin                  1       .582    .075    .004    .192    .054
Identical Patterns       2       .092    .547   -.016    .239    .014
Fitting Parts            3       .345   -.041   -.009    .265    .005
Opposite Sides           4       .132    .028    .160    .313    .408
Code                     5       .440    .067    .004    .394   -.062
Circle Grouping          6      -.076    .436    .161    .641   -.073
Form Series              7      -.141    .040    .016    .639    .453
Circle Reasoning         8      -.010   -.046    .561    .518    .026
Form Relations           9       .076    .162    .415    .507    .244
Form Reasoning          10       .080    .021   -.053    .766    .071





TABLE 5

TRANSFORMATION MATRIX (Λ)

                          Reference Vector
Centroid Axis       A       B       C       D       E
I                 .287    .247    .207    .801    .203
II               -.859   -.243    .361    .248    .379
III              -.380    .671   -.690    .234   -.224
IV               -.184    .521    .582   -.257   -.436
V                 .033   -.399    .111    .421   -.759
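
The rotated matrix of Table 4 is the product of the centroid matrix F (Table 2) and the transformation matrix Λ (Table 5). The sketch below carries out that multiplication in plain Python. The numbers are transcribed from the printed tables, with minus signs restored as the print allows. One entry is an assumption: the loading of reference vector A on centroid axis V is taken as .033, the value that reproduces Tables 4 and 6, although the scan appears to read "013". The helper name `matmul` is illustrative and not anything used by the author.

```python
# Centroid loadings (Table 2): rows are tests 1-10, columns are factors I-V.
F = [
    [.438, -.435, -.183, -.083, -.069],  # 1  Manikin
    [.452, -.141,  .274,  .263, -.200],  # 2  Identical Patterns
    [.335, -.212, -.101, -.140,  .087],  # 3  Fitting Parts
    [.499,  .100, -.163, -.112, -.242],  # 4  Opposite Sides
    [.506, -.297, -.055, -.079,  .130],  # 5  Code
    [.701,  .138,  .296,  .272,  .109],  # 6  Circle Grouping
    [.622,  .377,  .110, -.274, -.117],  # 7  Form Series
    [.602,  .281, -.252,  .238,  .205],  # 8  Circle Reasoning
    [.728,  .181, -.154,  .177, -.093],  # 9  Form Relations
    [.665,  .119,  .166, -.251,  .239],  # 10 Form Reasoning
]

# Transformation matrix (Table 5): rows are centroid axes I-V,
# columns are reference vectors A-E. The V/A entry (.033) is assumed.
LAMBDA = [
    [ .287,  .247,  .207,  .801,  .203],
    [-.859, -.243,  .361,  .248,  .379],
    [-.380,  .671, -.690,  .234, -.224],
    [-.184,  .521,  .582, -.257, -.436],
    [ .033, -.399,  .111,  .421, -.759],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


V = matmul(F, LAMBDA)
for number, row in enumerate(V, start=1):
    print(f"{number:2d}", " ".join(f"{value:6.3f}" for value in row))
```

For the Manikin test the product comes out at about .582, .075, .004, .192, and .054, matching the first row of Table 4.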


TABLE 6

CORRELATIONS BETWEEN NORMALS TO THE PLANES (Λ'Λ)

                       Plane
Plane        A       B       C       D       E
A          1.000
B          -.084   1.001
C          -.092   -.241   1.000
D          -.011   -.007   -.009   1.001
E          -.127   -.117   -.005   -.003   1.001
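
The entries of Table 6 are the scalar products of the columns of Λ, so they can be recomputed from Table 5. The sketch below does this in plain Python. The values are transcribed from the printed table; the loading of reference vector A on centroid axis V is taken as .033, an assumption that makes the products agree with the print (the scan appears to read "013"). The function name is illustrative.

```python
# Transformation matrix (Table 5): rows are centroid axes I-V,
# columns are reference vectors A-E. The V/A entry (.033) is assumed.
LAMBDA = [
    [ .287,  .247,  .207,  .801,  .203],
    [-.859, -.243,  .361,  .248,  .379],
    [-.380,  .671, -.690,  .234, -.224],
    [-.184,  .521,  .582, -.257, -.436],
    [ .033, -.399,  .111,  .421, -.759],
]


def column_products(matrix):
    """Return the matrix of scalar products between all pairs of columns."""
    columns = list(zip(*matrix))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in columns]
            for ci in columns]


gram = column_products(LAMBDA)
for label, (i, row) in zip("ABCDE", enumerate(gram)):
    # Print the lower triangle only, as in Table 6.
    print(label, " ".join(f"{value:6.3f}" for value in row[: i + 1]))
```

With these inputs the diagonal entries fall within about .001 of unity and the B-C product is about -.241, in agreement with the printed table.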


Even a cursory glance at the rotated matrix will show
that the factorial composition of the tests is not so simple
as had been hoped for.

Factor "A" has three variables with significant projections,
and all the others are essentially zero. These are:

     1. Manikin            .58
     3. Fitting Parts      .35
     5. Code               .44

Either one of two interpretations could be placed on this
factor. It might be considered to be Space as has been
described by Thurstone (6), the author (1), and others. Under
this interpretation it would seem that the grasping of spatial
relations of the arms and legs of the manikins was of more
importance than the quick perception of small differences. It
would appear also that the quick comparison of the code
with the stimuli in the Code test was not so important in
solving the problem as the grasping of the relationship
between the two halves of the individual elements.

The other interpretation which could be placed on this
factor is that it is Perceptual Speed, or rather mental
alertness, as distinguished from Perceptual Discrimination. Under
this interpretation the ability would involve the quick change
of response from item to item with only the simplest
discrimination necessary. Thurstone's factor "9" in his study
of Hyde Park High School in Chicago seems to have some
of the characteristics of factor "A" (5). In this case, the
test Scattered X's had the highest loading. The Manikin
test has the simplest discrimination level and the Fitting Parts
test the most complex of those listed. The author prefers
this latter interpretation.

Factor "B" has two tests which have significant loadings:

     2. Identical Patterns    .55
     6. Circle Grouping       .44

It seems obvious that this factor corresponds to Thurstone's
(6) Perceptual Speed factor, but we shall call it
Perceptual Discrimination to distinguish it from factor "A."
The difference here is that the emphasis is on analytic
perception in which a fine discrimination must be made rather
than on speedy response to a simple stimulus. Speed is of
importance, but in the subjects used the differences in the
mental process of perceptual discrimination will contribute
more to performance variance than will simple speed.

At first glance it seems surprising that Circle Grouping
is high on this factor. However, a careful subjective analysis
of the test will indicate that the problems involved are more
those of perceptual discrimination than of induction. The
figures are complex but the rules to be brought out are simple.
For example, one of the items has the middle dot blackened
in a group of three, which is apparent even at a glance, so
that the problem resolves into finding the correct group in
the response square. This takes a discriminatory ability
evidently slightly below that required for Identical Patterns.

Factor "C" has two variables with significant projections:

     8. Circle Reasoning      .56
     9. Form Relations        .42




Both of these tests are variations of tests used by Thurstone
in his studies of the primary mental abilities and have
been interpreted to contain Induction, or Inductive Reasoning.
This interpretation is suitable in the present case. The
apparent paradox that test 8 contains Induction while test 6,
in which a supposedly similar function is involved, does not
may be resolved when an inspection is made of the tests
themselves. The primary problem in test 8 is to find a rule by
which the problem may be solved, while in test 6 the main
problem, as has been said before, is to find the response group
rather than the rule.

Factor "D" is an orthogonal factor which was set up by
making its normal perpendicular to the normals of all the
other planes. This was necessary as one dimension of the
five-dimensional system could not be identified by a bounding
hyperplane because of lack of variables with zero projections
in that dimension. It is the same type of problem as was
encountered by the author in a former study (1).

All the variables have projections on this factor which
are probably significant. The relative amount of projection
seems to increase with the greater complexity of the mental
function involved. The tenth test, Form Reasoning, which
involves the synthesis of geometrical figures according to
established rules (not unlike arithmetic), has much the highest
saturation of the factor.

The obvious comment, and one that must be reckoned
with, is that this factor represents "general intelligence," or
Spearman's factor "g." As has been said before, there is
nothing in the Thurstone method of analysis which denies
that such a general factor exists or implies that it would not
show up if present. However, in regard to the nature of the
present factor, there can be little doubt that it is "general"
for this battery of tests and is not an effect of maturation or
lack of differentiation of ability due to the youth of the
subjects. What it is called, whether comprehension, understanding,
mental efficiency, or intelligence, is beside the point. Due to
the popular misconceptions and scientific vagueness of the last
term, it probably would be better to adopt some other name.

It should be understood that the author is of the opinion
that the above-mentioned effect of an augmented general factor
due to lack of maturation is applicable to situations in which
the subjects are immature, but that such a factor does not
account for appreciable distortion in the present case. It is not
denied that such a general factor is present in tests given to
children, but it seems probable that the general factor, if it
exists in such a case, is unduly emphasized by the maturation
curves of the abilities.

Another interpretation which might be placed on factor
"D" is that it is Deductive Reasoning, which in each test
requires that the subject must base his conclusions or responses
on certain facts which are presented in the test item.
However, this is probably another aspect of the foregoing
discussion.

Factor "E" has significant loadings for two tests and a
possibly significant loading for a third:

     4. Opposite Sides        .41
     7. Form Series           .45
     9. Form Relations        .24

This factor apparently corresponds with none of the
factors previously identified by Thurstone and his associates.
However, it may possibly represent Deductive Reasoning, as
"series" tests have been found by Thurstone (5) to contain
a component of Deductive Reasoning. The same is true of
the form relations type of test. The relationship of the
Opposite Sides test to such an interpretation is not immediately
apparent. Assuming that one might consider two figures in
each item of the Opposite Sides test as facts to be compared
and from which a conclusion might be drawn concerning the
third figure, i.e., whether it is different from the first two or
like one of them, then it might be thought to involve
Deduction. In the Form Series test the symbols presented are facts
from which a conclusion must be drawn concerning the missing
figure. The conclusion is definitely limited to three alternatives,
each of which might be tried in turn. In the Form Relations
test the problem might be approached by trying to find the
rule involved, which would be Induction, or by substituting the
possible answers one at a time and testing the resulting
equation. This latter process might be considered to be Deductive
Reasoning and, insofar as it were used, would cause the test to
show a loading on the Deduction factor.

No definite conclusion can be made as to the identity
of Factor "E," but tentatively it may be called Deductive
Reasoning.

Despite the fact that the factorial composition of some
of the tests varies somewhat from what was originally
supposed, it seems that the tests, as a group, do measure some
of the higher mental processes of reasoning. From amount
of projection on the general factor, it would seem that the
tests saturated with Perceptual Speed are the poorest measures
of the higher intellective processes. It would appear that test
number 9, Form Relations, which has significant projections
on three factors, is probably the best general test of all the
reasoning processes. Test 10, Form Reasoning, is the best
test of the general factor, which might be considered to be
synonymous with comprehension, mental efficiency, or
intelligence. The test, Identical Patterns, seems to be saturated
with the factor Perceptual Discrimination, which is
interpreted quite similarly to Thurstone's factor of Perceptual
Speed, and is consistent with Thurstone's (5) test of Identical
Forms, which is parallel in process. The test, Circle
Reasoning, a variation of Thurstone's (5) Marks test, is similar in
factorial composition to the latter. The Form Relations test
seems to have a heterogeneous factorial makeup, as was also
found by Thurstone (6).

The factors identified seem to be consistent with those
identified by Thurstone (6, 5) except for the general factor.
It is necessary to investigate these tests in a larger battery
before an interpretation can be adequately applied to the
general factor. This factor has some characteristics similar to
those found by the author (1) in factor "D" in a "Reanalysis
of a Test of the Theory of Two Factors." The factor
Perceptual Speed also seems similar to the factor "C" in the
latter study.

The factors have been found to be practically uncorrelated,
the highest correlation, that between factors "B" and "C,"
being only 14 degrees off orthogonality. This is probably
within chance variation and no significance is attached to it.
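
The figure of 14 degrees follows from treating the correlation between two normals as the cosine of the angle between them. A minimal sketch of the conversion, using the B-C value of -.241 from Table 6 (the function name is illustrative):

```python
import math


def degrees_off_orthogonal(r: float) -> float:
    """Departure from 90 degrees of the angle whose cosine is r."""
    return abs(math.degrees(math.acos(r)) - 90.0)


r_bc = -0.241  # correlation between the normals of planes B and C (Table 6)
print(f"{degrees_off_orthogonal(r_bc):.1f} degrees off orthogonality")
```

The angle between the two normals is close to 104 degrees, that is, about 14 degrees beyond a right angle.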

REFERENCES 

1. Blakey, R. I. "A Reanalysis of a Test of the Theory of Two
   Factors," Psychometrika, V (1940), 121-36.

2. Chesire, L., Saffir, M., and Thurstone, L. L. Computing
   Diagrams for the Tetrachoric Correlation Coefficient. Chicago:
   University of Chicago Press, 1933. 59 pages.

3. Coombs, Clyde. Unpublished paper read before the American
   Psychological Association, September, 1940.

4. Thurstone, L. L. "A New Rotational Method in Factor
   Analysis," Psychometrika, III (1938), 199-218.

5. Thurstone, L. L. "Experimental Study of Simple Structure,"
   Psychometrika, V (1940), 153-68.

6. Thurstone, L. L. Primary Mental Abilities. Chicago:
   University of Chicago Press, 1938. 121 pages.





NEW TESTS* 


California Capacity Questionnaire, by Elizabeth T. Sullivan,
Willis W. Clark, and Ernest W. Tiegs. 1941. For high
school and college students, and adults. Time, 30 minutes.
Forms A and B; 75¢ per 25; 25¢ per specimen set.
Published by the California Test Bureau, 3636 Beverly
Boulevard, Los Angeles, California.


California Test of Personality, by Louis P. Thorpe, Willis
W. Clark, and Ernest W. Tiegs. 1940. One form each
for primary, elementary, intermediate, secondary, and
adult levels. Time, about 45 minutes for each series.
Primary series for grades 1-3; elementary series for grades
4-9; intermediate series for grades 7-10; secondary series
for grades 9-14; adult series; $1.00 per 25 of each series;
25¢ per specimen set of each series. Published by the
California Test Bureau, 3636 Beverly Boulevard, Los
Angeles, California.


Cooperative Community Affairs Test, by Roy A. Price and
Robert F. Steadman. 1941. Time, 30 minutes. Form R;
$3.50 per 100; 25¢ per specimen set. Published by the
Cooperative Test Service, 15 Amsterdam Avenue, New
York, New York.


Cooperative Literary Comprehension and Appreciation Test,
by Hyman Eigerman, Mary Willis, and Frederick B.
Davis. 1941. For upper high school and college classes.
Time, 40 minutes. Form R; $4.50 per 100; 25¢ per
specimen set. Published by the Cooperative Test Service, 15
Amsterdam Avenue, New York, New York.

*Publishers and authors of new tests are requested to send copies to The
Editor, Educational and Psychological Measurement, Box 766, Alexandria, Va.


Cooperative Science Test, by John G. Zimmerman and
Richard E. Watson. 1941. For grades 7, 8, and 9. Time,
80 minutes. Form R; $5.50 per 100; 25¢ per specimen
set. Published by the Cooperative Test Service, 15
Amsterdam Avenue, New York, New York.


Cooperative Social Studies Test, by Agatha Townsend and
Mary Willis. 1941. For grades 7, 8, and 9. Time, 80
minutes. Form R; $5.50 per 100; 25¢ per specimen set.
Published by the Cooperative Test Service, 15 Amsterdam
Avenue, New York, New York.


Dunlap Academic Preference Blank, by Jack W. Dunlap.
1940. For grades 7, 8, and 9. Forms A and B; 90¢ per
25; 20¢ per specimen set. Published by the World Book
Company, Yonkers, New York.


Eames Eye Test, by Thomas H. Eames. 1940. $3.50 for
examiner's kit; 65¢ per 25 individual record cards.
Published by the World Book Company, Yonkers, New York.


Examination for the Measurement of the Efficiency of Mental
Functioning, by Harriet Babcock and Lydia Levy. 1940.
One form; set of test materials, $11.20; record blanks,
$2.30 per 25, $6.90 per 100. Published by C. H. Stoelting
Company, 424 North Homan Avenue, Chicago, Illinois.


Fourth Grade Geography Test, by Zoe A. Thralls, George
Miller, and Marguerite Uttley. 1940. For use at the
end of the fourth grade. Time, 35 minutes. One form;
8¢ per test; 4¢ per manual; 20¢ per scoring stencil.
Published by McKnight and McKnight, Bloomington, Illinois.



Hills Economics Test, by John R. Hills. 1940. For high
school and college students. Time, 40 minutes. One form;
50¢ per 25; 15¢ per specimen set. Published by Bureau
of Educational Measurements, Kansas State Teachers
College, Emporia, Kansas.


Kansas Vocabulary Test, by H. E. Schrammel, O. M.
Rasmussen, Anna Huebert, and D. J. Tate. 1940. For grades
4 to 8. Forms A and B; 40¢ per 25; 15¢ per specimen set.
Published by Bureau of Educational Measurements,
Kansas State Teachers College, Emporia, Kansas.


Kirkpatrick Chemistry Test, by Ernest Kirkpatrick. 1940.
For high school students. Time, 40 minutes. One form;
60¢ per 25; 15¢ per specimen set. Published by Bureau
of Educational Measurements, Kansas State Teachers
College, Emporia, Kansas.


Kniss World History Test, by F. Roscoe Kniss. 1940. For
high school students. Time, 50 minutes. Forms A and B;
$1.30 per 25; 20¢ per specimen set. Published by the
World Book Company, Yonkers, New York.


Mechanical Comprehension Test, by George K. Bennett.
1940. For male high school students and adults. Time,
about 25 minutes. One form; $2.50 per 25 booklets and
answer sheets; 25¢ per specimen set. Published by the
Psychological Corporation, 522 Fifth Avenue, New York,
New York.


Minnesota Personality Scale, by John G. Darley and Walter
J. McNamara. 1941. For upper high school and college
students. Time, about 45 minutes. Separate question
booklets for men and women; answer sheet can be used
with either question booklet; scorable by International
Test Scoring Machine; $1.50 per 25 question booklets;
75¢ per 25 answer sheets; 35¢ per specimen set. Published
by the Psychological Corporation, 522 Fifth Avenue, New
York, New York.


Mordy-Schrammel American Government Test, by F. E.
Mordy and H. E. Schrammel. 1940. For high school and
college students. Time, 40 minutes. One form; 50¢ per
25; 15¢ per specimen set. Published by Bureau of
Educational Measurements, Kansas State Teachers College,
Emporia, Kansas.


Mordy-Schrammel Constitution Test, by F. E. Mordy and
H. E. Schrammel. 1940. For high school and college
students. Time, 40 minutes. One form; 50¢ per 25; 15¢
per specimen set. Published by the Bureau of Educational
Measurements, Kansas State Teachers College, Emporia,
Kansas.


Peabody Library Information Test, by Louis Shores and
Joseph E. Moore. 1940. One form each for college, high
school, and elementary school levels. Time, 30 minutes.
College level: one form, $1.00 per 25. High school level:
one form, 75¢ per 25. Elementary school level: one form,
60¢ per 25; 20¢ per specimen set. Published by the
Educational Test Bureau, 720 Washington Avenue, S.E.,
Minneapolis, Minnesota.


Rasmussen Trigonometry Test, by O. M. Rasmussen and O.
J. Peterson. 1940. For high school and college students.
Time, 40 minutes. One form; 50¢ per 25; 15¢ per
specimen set. Published by Bureau of Educational
Measurements, Kansas State Teachers College, Emporia, Kansas.



Stanford Achievement Test, by Truman L. Kelley, Lewis M.
Terman, and Giles M. Ruch. 1941. Forms D and E for
each of primary, intermediate, and advanced levels from
grades 2 to 9. Primary Battery, for grades 2 and 3:
time, 50 minutes; $1.10 per 25; 20¢ per specimen set.
Intermediate Battery, Complete, for grades 4 to 6: time,
150 minutes; $2.00 per 25; 40¢ per specimen set.
Advanced Battery, Complete, for grades 7 to 9: time, 150
minutes; $2.00 per 25; 40¢ per specimen set. Published
by the World Book Company, Yonkers, New York.


Tate Economic Geography Test, by D. J. Tate and G. A.
Buzzard. 1940. For high school and college students.
Time, 50 minutes. Forms A and B; 50¢ per 25; 15¢ per
specimen set. Published by Bureau of Educational
Measurements, Kansas State Teachers College, Emporia,
Kansas.


Trusler-Arnett Health Knowledge Test, by V. T. Trusler,
C. E. Arnett, Jr., and H. E. Schrammel. 1940. For
grades 9 to 12 and college. Time, 50 minutes. Forms A
and B; 50¢ per 25; 15¢ per specimen set. Published by
Bureau of Educational Measurements, Kansas State
Teachers College, Emporia, Kansas.


Turse Shorthand Aptitude Test, by Paul L. Turse. 1940. For
use with high school students before enrolling in
shorthand courses. Time, 45 minutes. One form; $1.30 per 25;
10¢ per specimen set. Published by the World Book
Company, Yonkers, New York.


Vocational Inventory, by Curtis G. Gentry. 1940. For high
school and college students, and adults. Time, about 150
minutes. One form; 15¢ for vocational inventory,
individual analysis report, and individual score tabulation
sheet; 25¢ per sample set. Published by the Educational
Test Bureau, 720 Washington Avenue, S.E., Minneapolis,
Minnesota.





MEASUREMENT ABSTRACTS* 


Adkins, Dorothy C. and Kuder, G. Frederic. "The Relation
of Primary Mental Abilities to Activity Preferences."
Psychometrika, V (1940), 251-62.

The relations of abilities, as measured by Thurstone's
Tests for Primary Mental Abilities, to activity preferences, as
measured by Kuder's Preference Record, are investigated for
a population of 512 university freshmen. Ability profiles for
contrasted groups on each preference scale reveal relatively
slight overlapping between the two sets of measures, although
the apparent trends are reasonable. The Pearson
intercorrelation coefficients of all pairs of measures involved were
determined. Implications of the findings in relation to theory
and to educational and vocational guidance are indicated.
(Courtesy Psychometrika.)


Allison, G. and Barnett, A. "Freshman Psychological
Examination Scores as Related to Size of High Schools."
Journal of Applied Psychology, XXIV (1940), 651-52.

Quantitative and linguistic scores of 1,083 college
freshmen on the 1938 edition of the A.C.E. Test were analyzed
with reference to the size of the high schools from which
they graduated. For three size-groups, statistically significant
differences in means were found in five of six comparisons.
Means tend to increase with enrollment but there is much
overlapping. W. A. Varvel.


Anderson, H. A. and Traxler, A. E. "The Reliability of the
Reading of an English Essay Test," Part II. School
Review, XLVIII (1940), 521-30.


*Edited by Professor Forrest A. Kingsbury.




Factual notes were prepared on two themes: "The
Discovery of Gold in California" (Form A), and "The Pony
Express" (Form B). A group of 281 pupils in the University
High School of the University of Chicago were given the
two forms at one year's interval with instructions to expand
the material into a two-hour essay. The essays were graded
on a sixty-point scale with the following weights for the
separate factors: completeness (6), spelling (6), punctuation
(6), language errors (6), coherence between main divisions
(10), organization of paragraphs (10), and organization of
essay sentences (10). On rereading 70 essays of each form
the grades of a skilled reader showed correlations of .893 ±
.016 and .937 ± .010 for the two forms; two readers, on
first scoring of 25 papers, showed correlations of .859 ± .035
and .898 ± .026 for the two forms. For individual factors,
no correlation was below .80. Growth in language ability may
be indicated by an average gain of 3.3 points from Form A
to Form B for 281 pupils. The results are not deemed
conclusive, but only suggestive of the desirability of
experimentation with essay-test procedures. J. E. Karlin.

Babitz, Milton and Keys, Noel. "A Method for Approximating
the Average Inter-Correlation Coefficient by
Correlating the Parts with the Sum of the Parts."
Psychometrika, V (1940), 283-88.

It is noted that the average inter-item correlation, which
represents the internal consistency of a test, yields a unique
estimate of test reliability. A close approximation to this
average is given by a formula which requires the correlation
of each item with the total score and the standard deviation
of each item. The formula is especially useful in those
instances where the number of items is small and where the
variation in item sigmas should not be neglected. (Courtesy
Psychometrika.)
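
The abstract does not reproduce the Babitz-Keys approximation, so it is not attempted here. What can be sketched is the standard link between an average inter-item correlation and a reliability estimate, the generalized Spearman-Brown formula, offered as an illustration of why that average carries reliability information. The function name and the example figures are hypothetical.

```python
def reliability_from_mean_r(mean_r: float, n_items: int) -> float:
    """Generalized Spearman-Brown estimate of test reliability.

    mean_r  : average correlation between pairs of items
    n_items : number of items in the test
    """
    return n_items * mean_r / (1.0 + (n_items - 1) * mean_r)


# Hypothetical example: ten items with an average intercorrelation of .20.
print(f"{reliability_from_mean_r(0.20, 10):.3f}")
```

For ten items averaging .20, the estimate is about .71; with a single item the formula returns the item correlation itself.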

Benton, A. L. and Perry, J. D. "A Study of the Predictive
Value of the Stanford Scientific Aptitude Test (Zyve)."
Journal of Psychology, X (1940), 309-12.

Scores on the Stanford Scientific Aptitude Test and the
A.C.E. Psychological Examination (1934-35) together with
course grades for 43 students over a period of three to four
years were used in an investigation of the predictive value of
the Aptitude Test. The average score on the A.C.E. was
approximately one sigma above the mean for the 1935
freshmen. Correlations of course grades for scientific and
non-scientific courses with the Aptitude Test and the A.C.E. Test
were about +.35, there being no significant difference between
the sets of correlations. The coefficients of correlation of
the 11 subtests and average grades in all college courses "were
all quite low." The authors suggest "that the test has a
certain limited value in prognosticating the scholastic achievement
of freshman and sophomore students." Harold Bechtoldt.

Buros, Oscar Krisen, Editor. The Nineteen Forty Mental
Measurements Yearbook. Highland Park, N. J.: The
Mental Measurements Yearbook, 1941. Pp. 674 + xxiii.

The first part of the Yearbook contains reviews of new
tests as well as of selected older tests. There are 524 tests
listed. Most of these are reviewed by two or three reviewers.
The second part lists 368 books and pamphlets in the
measurement field and excerpts from reviews of them which have
been published in various journals.

Cast, B. M. D. "The Efficiency of Different Methods of
Marking English Composition." Part II. British Journal
of Educational Psychology, X (1940), 49-60.

Forty English compositions were marked by 12 examiners
by four different methods: (1) the examiner's own habitual
method; (2) the method of general impression; (3) Burt's
analytic method (allotting separate marks for specified points
or qualities); (4) Hartog's achievement method. The
P-technique (correlation of persons) was combined with Burt's
summation method for a factorial analysis of the correlations
between examiners; this resulted in (a) a general factor
(representing the best approximation to the "true marks")
accounting for 50 per cent of the variance; (b) a dichotomous factor
of examiners marking better by analytic methods or by
intuitive or impressionistic methods. The methods of marking for
general use are here found to be in order of preference: the
"analytic" method, the method of general impression, the
examiner's habitual method, and Hartog's achievement
method. J. E. Karlin.

Daniel, C. "Statistically Significant Differences in Observed
Per Cents." Journal of Applied Psychology, XXIV
(1940), 826-30.

A table gives the amount by which a per cent A observed
in one sample must exceed a per cent B observed in another
sample of the same size to be significant at the 0.05 level.
It is presented for different values of B and for samples from
20 to 1,000. Various conditions are stated and the meaning
and use of the table discussed. W. A. Varvel.
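
The abstract does not give the formula behind Daniel's table. A common way to test the difference between two observed per cents from samples of equal size is the pooled two-proportion z test, sketched below under that assumption, with a two-tailed 0.05 criterion also assumed. Function names and example figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_for_percent_difference(pct_a: float, pct_b: float, n: int) -> float:
    """z statistic for two per cents observed in samples of equal size n."""
    p_a, p_b = pct_a / 100.0, pct_b / 100.0
    pooled = (p_a + p_b) / 2.0              # pooled proportion (equal n)
    se = sqrt(2.0 * pooled * (1.0 - pooled) / n)
    return (p_a - p_b) / se


def is_significant(pct_a: float, pct_b: float, n: int,
                   alpha: float = 0.05) -> bool:
    """Two-tailed normal-approximation test at level alpha (assumed)."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(z_for_percent_difference(pct_a, pct_b, n)) > critical


# Hypothetical example: 60 per cent against 45 per cent, 100 cases each.
print(round(z_for_percent_difference(60, 45, 100), 2))
print(is_significant(60, 45, 100), is_significant(55, 45, 100))
```

With 100 cases in each sample, 60 against 45 per cent gives z of about 2.12 and passes the assumed criterion, while 55 against 45 per cent (z about 1.41) does not.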

Davis, F. B. "The Interpretation of I.Q.'s Derived from the
1937 Revision of the Stanford-Binet Scales." Journal of
Applied Psychology, XXIV (1940), 595-604.

The author presents a table of equivalent values for I.Q.'s
from the 1916 and 1937 revisions of the Stanford-Binet and
suggests a new classification of I.Q.'s based on the 1937 form.
The method by which the equivalency was calculated is
discussed. The suggested classification of I.Q.'s provides a series
of equal steps or gradations of brightness. W. A. Varvel.

Dongan, K. E. and Gory, A. E. "Selecting Unskilled Laborers
in Cincinnati." Public Personnel Review, I, No. 3 (1940),
43-50.

Job analyses were made of jobs for unskilled laborers
as eligible lists became needed. It was agreed that the ability
to read and write, a good physique, intelligence, experience,
and an age range of 21 to 45 or 50 were required for the jobs.

An examination for waste collector included a practical
test calling for repeating a demonstration given by regular
workers, an evaluation of training and experience, and an oral
interview. A test for street cleaners was composed of 75
multiple-choice items on arithmetic, vocabulary, reasoning, and
general information. These questions were put in the language
of laborers.

Examining for unskilled labor positions has gone on only
since February, 1940. The departments, however, believe
they are getting better workers.

Dressel, Paul L. "Some Remarks on the Kuder-Richardson
Reliability Coefficient." Psychometrika, V (1940),
305-10.

The Kuder-Richardson reliability coefficient is derived in
a manner independent of that originally given. Various
alternative forms applicable to special situations are exhibited
with the purpose of making them available to others interested
in using this formula. A simplification in computation is
suggested for use with a calculating machine. (Courtesy
Psychometrika.)
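
For readers unfamiliar with the coefficient, the form usually known as Kuder-Richardson formula 20 is r = [k / (k - 1)] [1 - Σpq / σ²], where k is the number of items, p and q are the proportions passing and failing each item, and σ² is the variance of total scores. The sketch below computes it for a small matrix of right-wrong scores. It is the textbook formula, not Dressel's alternative derivation or his machine shortcut; the population form of the variance (dividing by N) and the made-up data are assumptions.

```python
def kr20(scores):
    """Kuder-Richardson formula 20 for a persons-by-items matrix of 0/1 scores."""
    n_persons = len(scores)
    n_items = len(scores[0])
    # Proportion passing each item, and the sum of p*q over items.
    p = [sum(person[i] for person in scores) / n_persons
         for i in range(n_items)]
    sum_pq = sum(pi * (1.0 - pi) for pi in p)
    # Variance of total scores (population form, dividing by N).
    totals = [sum(person) for person in scores]
    mean_total = sum(totals) / n_persons
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical data: five persons, four items of increasing difficulty.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(example), 3))
```

For this example Σpq is .80 and the variance of totals is 2.0, giving a coefficient of .80.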

Ferguson, George A. "The Application of Sheppard's
Correction for Grouping." Psychometrika, VI (1941), 21-7.

This paper attempts to show in a non-mathematical way
the influence of grouping on standard deviations and
correlations, and advances empirical evidence to illustrate with what
accuracy values corrected for grouping by Sheppard's
correction approximate values obtained from ungrouped data when
the distributions are continuous. This inquiry gained its initial
stimulus from the observation that many standard deviations
and correlations reported by students of psychology and
education are uncorrected for grouping and that frequently errors
attributed to the grouping of data are not small when
compared with errors of sampling. (Courtesy Psychometrika.)
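
Sheppard's correction subtracts h²/12 from a variance computed from grouped data, where h is the width of the class interval. Because grouping inflates the variances but not, to the same approximation, the covariance, a correlation is corrected by replacing the two standard deviations in its denominator. The sketch below applies both steps; the function names and the example figures are hypothetical.

```python
from math import sqrt


def sheppard_sd(grouped_sd: float, interval_width: float) -> float:
    """Standard deviation corrected for grouping by Sheppard's correction."""
    corrected_var = grouped_sd ** 2 - interval_width ** 2 / 12.0
    if corrected_var <= 0:
        raise ValueError("interval too coarse for the correction to apply")
    return sqrt(corrected_var)


def sheppard_r(r: float, sd_x: float, sd_y: float,
               width_x: float, width_y: float) -> float:
    """Correlation with both standard deviations corrected for grouping."""
    return (r * sd_x * sd_y
            / (sheppard_sd(sd_x, width_x) * sheppard_sd(sd_y, width_y)))


# Hypothetical example: grouped SD of 10 with class intervals 5 units wide.
print(round(sheppard_sd(10.0, 5.0), 3))
print(round(sheppard_r(0.50, 10.0, 10.0, 5.0, 5.0), 3))
```

In this example the standard deviation drops from 10 to about 9.90 and a correlation of .50 rises to about .51, showing that the correction matters more as intervals grow coarse relative to the spread of the scores.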

Godard, R. H. and Lindquist, E. F. "An Empirical Study of
the Effect of Heterogeneous Within-Groups Variance upon
Certain F-Tests of Significance in Analysis of Variance."
Psychometrika, V (1940), 263-74.

In the application of the analysis of variance to data
obtained in educational methods experiments which involve
several classes of several schools, one assumption is that of
homogeneity in the variances of pupil scores from school to
school. It is shown that such variances on representative
educational achievement tests are heterogeneous. The effects
of this heterogeneity upon the F-tests of significance commonly
employed in methods experiments are investigated by
comparing the actual distribution of F values for a large number
of "experiments" involving marked heterogeneity with a
theoretical distribution based on the assumption of
homogeneity. Although the findings, which vary somewhat with the
type of variance ratio, are not entirely conclusive, they
apparently demonstrate that departure from homogeneity does not
invalidate the use of the customary F-tests for evaluating
results of the typical methods experiment. (Courtesy
Psychometrika.)

Goodenough, Florence L. and Maurer, Katharine M. "The
Relative Potency of the Nursery School and the
Statistical Laboratory in Boosting the IQ." Journal of
Educational Psychology, XXXI (1940), 541-49.

This study recomputed data obtained at the Minnesota
Nursery School by those statistical procedures generally
employed in the Iowa statistical laboratory. In the Iowa
procedure, cases were grouped according to initial I.Q. instead
of paternal occupation. This recomputation of data, which
when handled properly showed no effect of nursery school
training upon the I.Q., gave results similar to those reported
from Iowa. A difference in I.Q. appeared for children who
remained at home as well as for nursery school children. The
authors conclude that the previously reported differences are
the result of fallacious statistical treatment rather than being
an educational phenomenon. D. A. Peterson.

Guilford, J. P. “The Phi Coefficient and Chi Square as Indices
of Item Validity.” Psychometrika, VI (1941), 11-9.
Two new methods of item analysis are described. One
involves the computation of the φ coefficient (correlation of
a fourfold point distribution) and the other involves chi
square. The only data required are the proportions of passing
individuals in the upper and lower criterion groups, for the
determination of φ, and in addition, N, for the determination
of chi square. Abacs are presented for graphic solution of the
two indices of validity, and tests of significance are provided.
(Courtesy Psychometrika.)


Jenkins, R. L. “Considerations Relative to the Selection of an
Index of Intelligence.” Journal of Educational Psychology,
XXXI (1940), 527-40.

The test-retest stability of the I.Q. and the P.C. (Heinis
personal constant) are compared in terms of Binet test ratings
for 1,774 cases. The group was weighted with retarded
children. Comparisons of all adjacent tests were made.
Regression of both I.Q.’s and P.C.’s toward the mean on
retest was found, with marked drops in the P.C.’s of very
bright children. “The P.C. appears to offer no advantage
over the I.Q. for the children of the middle-age group”
and appears to be slightly inferior to the I.Q. at the lower
age levels.

The rationale underlying the two statistics is considered.
Dispersions in intelligence are assumed in both cases to be
proportional to the mental age. The growth functions assumed
by the I.Q. and the P.C. are presented with the point that
both curves have one degree of freedom.

It is suggested that a more logical approach would be to
express “mental test performance in terms of the sigma value
of test score for the chronological age.” The assumption is
less restrictive than those for the constancy of the I.Q. or
P.C.; the assumption is “that the relative status of children
with respect to intelligence remains constant,” which is
“implicit in the use of any index of intelligence for predictive
purposes.” This index avoids the logical fallacy involved in
adult mental ages. It is pointed out that the growth function
may be a two-parameter curve which would not interfere with
the use of sigma values, but would reduce the value of a single
parameter statistic. Harold Bechtoldt


Page, J. D. “The Effect of Nursery-School Attendance Upon
Subsequent I.Q.” Journal of Psychology, X (1940),
221-30.

Stanford-Binet I.Q.’s of 72 children in kindergarten to the
fifth grade who had previously attended nursery school 125
to 525 days were compared with those of adjacent older
siblings who had not attended preschool. One hundred children
of like age and socio-economic status were also compared with
their adjacent older siblings, none of either group having
attended preschool. No significant differences in I.Q. could
be referred to nursery-school attendance. A slight advantage
of younger siblings in both experimental and control groups
was explained by age fluctuations in the standardization of
the L form of the Stanford-Binet. No relation was found
between duration of nursery-school attendance and subsequent
I.Q. advantage. The mean I.Q. difference between sibling
pairs approximated 10 points. W. A. Varvel

Powell, N. J. “Check List for Use in Civil Service Objective
Test Preparation.” Public Personnel Quarterly, II (1940-
41), 13-6.

The article includes a list of questions which have been
developed for reviewing civil service objective tests before
they are finally used. Its use is intended to “increase the
probability that no major basis upon which the test will be
appraised has been ignored in the test construction.” Points
to be checked are listed under the following headings: validity,
cost, appearance of test, typography, and administration. A
number of questions applying specifically to completion items
and multiple-choice items are also listed.

Roff, Merrill. “Linear Dependence in Multiple Correlation
Work.” Psychometrika, V (1940), 295-98.
The problem in multiple correlation work of nonsense
results attributable to linear dependence of variables, which
has been discussed by Ragnar Frisch in relation to economic
data, is presented from the standpoint of its significance in
psychological research. It is shown that a symmetric correlation
determinant with unity in the diagonal cells can vanish
only when there is a first-order or partial correlation of unity
between one pair of the variables. On the basis of this result,
it is argued that the problem should be expected to cause less
difficulty in the field of psychology than in economics and that
psychologists should be able to avoid the pitfall by bringing
to bear their knowledge of the variables with which they are
working. (Courtesy Psychometrika.)


Royer, Elmer B. “A Machine Method for Computing the
Biserial Correlation Coefficient in Item Validation.”
Psychometrika, VI (1941), 55-9.

A method for computing the biserial correlation coefficient
with the aid of punch-card equipment is outlined. A numerical
example and a work sheet layout are included in the
presentation. (Courtesy Psychometrika.)

Ryans, David G. The First Step in Guidance: Self-Appraisal.
New York: Cooperative Test Service. 35 pp. 1941.
A report of the 1940 Sophomore testing program in which
the following tests were used: Cooperative English Test, Form
Q; Cooperative General Culture Test, Form Q; and Cooperative
Contemporary Affairs Test, Form 1940.


Sisk, H. L. “A Note on the Comparative Value of the ‘True’
Index of Studiousness for the Purpose of Prognosis.”
Journal of Psychology, X (1940), 275-78.
The scholastic achievement of 585 university freshmen was
predicted from Symonds’ “true” Index of Studiousness and
from a battery of tests, composed of aptitude, English, and
reading. The latter was found to give a more reliable
prediction of first semester grades. W. A. Varvel


Stoy, E. G. “Selection of Key-Punch Operators.” Journal of
Applied Psychology, XXIV (1940), 653-54.
These are notes on preliminary experimentation in the
selection of key-punch operators. Four tests warrant further
consideration: “an eye-hand coordination test in which letter
combinations involving both hands are registered on counters,
a test of verbal and spatial memory, a clerical type of test, and
an arithmetic test.” W. A. Varvel

Swineford, Frances and Holzinger, Karl J. “Selected
References on Statistics, the Theory of Test Construction, and
Factor Analysis.” School Review, XLVIII (1940),
460-66.

Articles covering the year March, 1939, to February,
1940, are presented with brief notes as to the nature of the
problem handled in each paper. Twelve articles are given
under the heading “Theory and Use of Statistical Methods,”
18 under “Problems of Test Construction,” and 16 under
“Factor Analysis.” Harold Bechtoldt

Thurstone, L. L. “A Factorial Study of Visual Gestalt
Effects.” Psychometrika, V (1940), 315-16. (Abstract
of a paper read at the September, 1940, meeting of the
American Psychological Association.)

Toolon, W. T. “Essential Factors in Test Construction.”
Personnel Journal, XVIV (1940), 204-08.
The value of careful “informal examination” of test items
before and after statistical treatment is pointed out, and an
analysis of the nature of the items and of the errors made
is suggested. Factors dealt with include item difficulty, item
conditions, closeness of distractors, and the judgment and
information of the subject. Harold Bechtoldt

Tucker, Ledyard R. “A Matrix Multiplier.” Psychometrika,
V (1940), 289-94.

A machine to expedite matrix multiplication has been
developed by modifying the International Business Machines
Corporation scoring machine. The principles and operation
of the machine are described, and time and accuracy estimates
are indicated. (Courtesy Psychometrika.)




EDUCATIONAL AND PSYCHOLOGICAL 

MEASUREMENT 


Volume I JULY, 1941 Number 3 


A New Performance Test for Young Deaf Children
Marshall S. Hiskey

Performance Testing in Public Personnel Selection
Sidney W. Koran

Some Data on the Kuder Preference Record
Arthur E. Traxler and William C. McCall

The Reliability of Ratio Scores
Lee J. Cronbach

Guiding Students to Become Self-Guiding
Joseph S. Kopas

An Attempt to Measure Scientific Thinking
Max D. Engelhart and Hugh B. Lewis

An Evaluation of Techniques of Measuring Visual Acuity
at the College Level
Frances Oralind Triggs and Karl E San/lt

The Concept of Scatter in the Light of Mental Test
Theory
Maurice Lorr and Ralph K. Meister

Measurement Abstracts

Measurement News



Copyright, 1941, by

SCIENCE RESEARCH ASSOCIATES


PRINTED IN THE UNITED STATES OF AMERICA



A NEW PERFORMANCE TEST FOR YOUNG 

DEAF CHILDREN 

MARSHALL S. HISKEY¹

University of Nebraska

Introduction

THERE has long been a need for a measuring device
which would give the teacher of the very young deaf
child a valid indication of his learning level at the beginning
of his educational career. One can find a considerable
number of more or less carefully worked out mental
tests which have been used for deaf and hard-of-hearing
individuals. However, the degree of help which such tests
render the educator or clinician depends upon the number
and representativeness of the children upon whom
they have been standardized, and also upon the reliability
and amount of information about the children which the
test makes available. Few tests have been standardized
on deaf children or used with such children at the beginning
of their school experience.

Instruction, especially at the lower levels, although
carried on as a group activity, actually involves considerable
individualized work. This individualization is, in
many instances, primarily a means of preparing for group
instruction. Therefore, if classes are not composed of
students of approximately the same level of ability, the
teacher must spend entirely too much of her time working
with the slower pupils as individuals. In many instances
this is done at the expense of the more capable students
and often results in a great waste of time since it is difficult
to keep the young deaf child occupied constructively
without the direct, and almost constant, guidance and
supervision of the teacher. If supplementary measuring
devices are valuable in making the school program more
effective for the hearing child, then they should be even
more valuable with a group who must start with the
handicap of deafness.

¹The writer wishes to acknowledge his indebtedness to Dr. D. A. Worcester
and other staff members of the Department of Educational Psychology and
Measurements of the University of Nebraska and to the administrations of the Iowa,
Nebraska, Kansas, Missouri, Illinois, Indiana, and Ohio State Schools for the
Deaf.

Difficulties involved in constructing a test for the
young deaf and hard-of-hearing. In the selection of items
for young deaf children, the special limitations of this
group must be kept constantly in mind. The actual testing
of deaf children presents problems which are unique.
Practically every impression of the test materials gained
by the deaf child must be through the sense of sight. All
instructions must be given through pantomime. Because
of the child’s complete lack of language experience, the
test items must have an unusual intrinsic attractiveness.
In addition to these problems one must devise a sufficient
variety of items to sample adequately the abilities of
individuals whose range of experiences has been seriously
restricted.

To attempt to obtain a rating of the “word fluency” of
the child who has been deaf since birth would be futile.
Nor does it seem appropriate to include speed tests, since
it is very difficult to give to the young deaf child the concept
of speed.

Based on the observations gained through testing the
members of both groups, the writer is of the opinion that
deaf subjects are more prone to “jump to conclusions”
and to overestimate their abilities, or the amount of material
which they have grasped, than are hearing subjects.
It is necessary to make them take their allotted time for
viewing materials before they attempt a response. On the
other hand, the examiner must always be on the alert, lest
through some slight change in facial expression he assist
the subject in making his response. The deaf or hard-of-hearing
child is continuously seeking visual clues and an
“arched eyebrow” or the “flicker of an eyelash” may
speak volumes to him.

The writer has made no attempt to compare the intellectual
development of deaf and hard-of-hearing children
with that of hearing children. The deaf child’s training
probably will never be identical with that of the hearing
child. The writer is of the opinion that the question of
primary importance is not, “How does the deaf child rank
in comparison with the hearing child?”, but rather,
“How does the deaf child rank in comparison with other
deaf children of his chronological age?”

Development and Standardization of the Scale 

Preliminary study of deaf and hard-of-hearing children
in school. In order to obtain a more adequate understanding
of the group, the writer made an intensive study
of deaf children as they actually went about their school
work.

For a period of more than four months the writer
spent three days every two weeks with these pupils in a
residence school for the deaf. Not only did he visit them
at their class work but he lived with them at the school
and associated with them on the playground, in the gymnasium,
and elsewhere. A complete record was made of
the activities which took place in the classroom and also
of those of an extra-curricular nature. This type of study
yielded a multitude of suggestions which were of the
utmost importance in the construction of the scale.

The selection and construction of test items. Every
item of the scale was considered in light of the following
criteria: (1) Was the item similar to the task, or tasks,
which the young deaf child did in school? (2) Was it the
type of item which could be included in a non-verbal
test? (3) Could the item be presented in such a way that
directions could be given through simple pantomime?
(4) Was it the type of item which experience had shown
to yield high correlation with acceptable criteria of intelligence
or learning ability? (5) Could the item be constructed
and presented in such a way that the child could
give a definite response, thus making the scoring objective
and easily done? (6) Would the item be appealing or
attractive to the subject? (7) Could the item be scored
without the score being based on time? (8) Did the difficulty
of the item appear to be within the age range of the
standardizing group? (9) Did the item seem likely to
show a high discriminative capacity?

In many instances, in order to meet all the above
criteria, it was necessary to devise special methods of
constructing or assembling the parts of an item. The preliminary
scale was composed of 18 different types of items
with a total of 204 items.

The use of the preliminary scale. This scale was given
to seventy-three pupils of the Iowa School for the Deaf,
whose ages ranged from three years ten months to nine
years eight months. Owing to the length of the scale, it
was divided into two parts; half of the group was
given Part A first and the other half of the group was
given Part B first. The two parts were given not less
than one day nor more than one week apart. In several
instances items were scored in detail, thus permitting a
later rescoring on a different basis.

After members of this tryout group were tested, an
item analysis was made and curves of the percentage
passing at each successive chronological age were plotted.
This was done for each of the 204 individual items of the
scale. The steepness of these curves afforded a graphic
indication of the validity of the items. The items which
appeared to function the most satisfactorily and to most
nearly approximate the criteria were retained. The criteria
used were (1) validity (based on the percentage
passing from one age to the next); (2) ease of administering;
(3) ease and objectivity of scoring; (4) attractiveness
or interest to the subject; (5) variety; and (6) time
of administering. When the sifting process was completed,
11 types of tests were retained, including a total
of 124 individual items.

The test items. A brief description of the items may
make later discussions more meaningful. The types are
as follows:

1. Memory for Colored Objects. Two sets of eight colored sticks
each, one set for the examiner and one set for the subject. The
examiner presents from one to five of the sticks from his group and
then removes them, and the subject must select the corresponding
sticks from his set from memory.

2. Bead Stringing. At the lower levels scoring is based on the number
of beads strung during a two-minute period. The intermediate
level demands the correct copying of bead patterns, while at the
upper level the subject is rated on his ability to reproduce patterns
from memory.

3. Pictorial Associations. This includes 12 series of pictures. In
each series two pictures are mounted side by side and a recess is
left for the insertion of the third picture which is associated with the
first two. This third picture must be selected from a group of
four unmounted pictures. (There are four unmounted pictures for
each series.)

4. Block Patterns. A set of eight drawings of block patterns and 16
blocks. The patterns are arranged in order of difficulty and the
subject must construct the pattern shown in the drawing.

5. Memory for Digits. Two sets of nine numbers each. The examiner
presents a number series and the subject must reproduce it from
memory.

6. Completion of Drawings. A series of 15 pictures, each with a part
missing. The subject must draw the missing part and thus complete
the picture.

7. Pictorial Identification. Six series of mounted pictures. Each series
has five pictures of a similar nature which are mounted side by
side. Four individual pictures which are duplicates of the mounted
pictures must be correctly identified by matching them with the
corresponding mounted picture.

8. Paper Folding. Six-inch squares of paper which must be folded
by the subject to duplicate (seven) patterns.

9. Visual Attention Span. Several series of pictures (varying from
one to six pictures each) and 15 individual pictures. The subject
is shown a picture series and he must use the individual pictures to
reproduce the presented series from memory.

10. Puzzle Blocks. Eight sets of variously shaped pieces of wood. Each
set can be put together to form a block.

11. Pictorial Analogies. Ten series of pictures with three pictures in
each series mounted and four pictures to use as choices. The first,
second, and third pictures of the analogy are mounted and a recess
is left for the insertion of the fourth picture which completes the
analogy. The subject must select the latter picture from among the
four available choices.

Use of the provisional scale. In addition to the work
done with the pupils of the Iowa school, the test was
administered in the state schools for the deaf in Nebraska,
Kansas, Missouri, Illinois, Indiana, and Ohio, as well as
to the members of the Lincoln, Nebraska, Day School.
All students, except a few who were ill, who were under
10 or who had had their tenth birthday within 15 days
of the examination date were tested. The test was administered
to 466 individuals. The standardizing group is
limited in numbers at the age of four and below, since
most schools do not accept children until they are five or
six years of age.

Derivation of the final scale. To save time and to
guarantee greater accuracy in the statistical data on which
the final selection of items would be based, Hollerith
techniques were used. By means of the Hollerith sorter
and counter it was possible to determine quickly the number
of individuals who were successful on each item in
successive ages throughout the range and thus to plot for
each item the curves of percentage passing. Items were
selected chiefly on the basis of discriminative ability, this
judgment being based on the increase in percentage passing
from one age to the next. The items in each group
were next arranged in order of difficulty, this order being
based on the percentage of the total group passing each
individual item.
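
The tabulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (an age paired with a pass/fail mapping per item) and the function names are hypothetical, and the judgment of "discriminative ability" is reduced here to the raw age-to-age gains in percentage passing, which the author inspected graphically rather than by formula.

```python
from collections import defaultdict


def percent_passing_by_age(records, item):
    """Return {age: percentage of that age group passing `item`}.

    `records` is a list of (age, results) pairs, where `results` maps an
    item name to True/False. This layout is an assumption for the sketch.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for age, results in records:
        total[age] += 1
        if results[item]:
            passed[age] += 1
    return {age: 100.0 * passed[age] / total[age] for age in sorted(total)}


def age_to_age_gains(curve):
    """Increase in percentage passing from each age to the next."""
    ages = sorted(curve)
    return [curve[b] - curve[a] for a, b in zip(ages, ages[1:])]


def order_by_difficulty(records, items):
    """Order items easiest first, by percentage of the total group passing."""
    n = len(records)
    overall = {
        item: 100.0 * sum(1 for _, results in records if results[item]) / n
        for item in items
    }
    return sorted(items, key=lambda item: overall[item], reverse=True)


# Hypothetical data: four children, two items.
records = [
    (4, {"a": True, "b": False}),
    (4, {"a": False, "b": False}),
    (5, {"a": True, "b": False}),
    (5, {"a": True, "b": True}),
]
curve_a = percent_passing_by_age(records, "a")
```

An item whose gains are large and consistently positive separates adjacent ages well; an item with flat or irregular gains would be a candidate for removal.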




To develop the table of norms, curves were plotted
showing for each age group the percentages making each
possible total score for each group of items. The score
necessary for passing each item at a certain age level was
considered to be that score which was made by approximately
70 per cent of the particular group. In all instances
the percentages were plotted, the curves were
smoothed, and the ends were extended to obtain what
might be termed “projected norms” at the extremes. This
smoothing of the curves of percentages gives a somewhat
truer indication of the ability level of the four-year-old
group.
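
As a rough illustration of the 70 per cent rule, the sketch below picks, for one age group and one type of item, the total score whose percentage is nearest to 70. It assumes the percentages are cumulative (per cent of the group reaching at least each score) and it omits the hand smoothing and extension of the curves described above; the function name, the tie-breaking rule, and the sample figures are all hypothetical.

```python
def norm_score(pct_reaching, target=70.0):
    """Pick the score whose percentage is closest to `target`.

    `pct_reaching` maps each total score to the per cent of the age group
    reaching at least that score. Ties go to the lower score, which is an
    arbitrary choice made for this sketch.
    """
    return min(pct_reaching, key=lambda s: (abs(pct_reaching[s] - target), s))


# Hypothetical percentages for one age group on one type of item.
example = {1: 100.0, 2: 99.0, 3: 90.0, 4: 73.0, 5: 48.0, 6: 24.0}
chosen = norm_score(example)
```

With these illustrative figures a score of 4 is selected, since 73 per cent is the value nearest to 70.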

To determine who should compose the four-year-old
group, or the five-year-old group, etc., it was decided to
classify all individuals as four whose ages were between
three years six months and four years five months and as
five those who were between four years six months and
five years five months, and so on. In no instance does the
mean chronological age of the standardizing group deviate
more than one month from the desired or true mean.
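
The grouping rule amounts to rounding the chronological age to the nearest year, with six months rounding up. A minimal sketch follows; the function name is hypothetical, and the sketch does not reproduce the separate handling of the oldest group, which the tables label 9-9.

```python
def age_group(years, months):
    """Assign a chronological age to its age group.

    Ages from (g - 1) years 6 months through g years 5 months fall in
    group g, as described in the text.
    """
    total_months = years * 12 + months
    return (total_months + 6) // 12


groups = [age_group(3, 6), age_group(4, 5), age_group(4, 6), age_group(5, 5)]
```

Because each group is centered on the whole year, the mean age of a uniformly filled group falls close to that year, which is consistent with the reported deviation of no more than one month.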

The unit of measurement. Perhaps the most common
method of interpreting scores on a scale such as this one
is the familiar Binet type mental age. This is the method
of using age norms and the amount of mental development
in a year as the unit of measurement. Age norms
are established for raw scores and are converted into, or
interpreted as, mental ages. This age-type score, representing
the amount of development up to date, has much
greater meaning to the layman than does the “standard
score” or the “percentile score” and for that reason the
age norm has been used in this scale. However, the term
“mental age” has not been used because the M.A. would
undoubtedly suggest a Binet Mental Age which in turn
would suggest the corresponding M.A. of the hearing
child and thus lead to false comparisons. For this reason
and because of the fact that the test items have been
selected, in many instances, because of their similarity to
the abilities which the deaf child must exhibit in school,
the term “Learning Age” is used instead.

An L.A. of 5-0 simply means that, according to the
results of this test, the child is able to do those tasks which
the average deaf child of five years is able to do, or that
he should be able to solve problems with the same average
efficiency as the average deaf five-year-old.

It is recommended that in the interpretation of test
results, the learning age be used instead of the learning
quotient (L.Q., derived by dividing the L.A. by the C.A.,
similar to the I.Q.). Until more conclusive evidence
regarding the respective influence of environment and
heredity on the mental development of the child, resulting
from more carefully controlled experiments, is produced,
one must proceed cautiously to insure that he is not closing
the door of opportunity to any child. If there is a
reasonable question as to whether the hearing child can
be improved through a stimulating program of training,
is it not likely that this question will assume even larger
proportions in the case of the deaf child?

Statistical Analysis of Test Data 
The accuracy of any test is dependent, not only upon
test items employed, but also upon the number of individuals
examined, the representativeness of the group, the
accuracy with which the test has been scored, the derivation
of accurate and meaningful norms, and various statistical
applications which are used for the purpose of
checking, or for interpretation. An additional and more
detailed statistical treatment of the data will be made at
some later date. Such topics as sex differences, effects of
schooling, relation of score to degree of hearing loss,
resemblances of score to teacher judgment of ability, and
the results of a factorial analysis of the test items, are
among those so reserved.

Adequacy of the standardization. Perhaps the main
criterion for the standardizing of any test is the selection
of representative populations at each age. The method
employed in meeting this problem has been described
briefly above, i.e., the testing of all available pupils
(within the desired age range) in a rather widely scattered
group of state schools for the deaf. However, to
check the adequacy of the sampling of cases, a table of
percentages of scores for each item was made which did
not include the students of the Indiana school and the
Ohio school. From this list of percentages, a table of
norms was derived. These norms were then compared
with the norms derived from the total group. In 89
per cent of the cases, the norms were found to be identically
located and in the remaining 11 per cent of the
cases they varied not more than six months. This would
indicate that the sampling was probably sufficient for
determining relatively stable norms.
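
A check of this kind can be sketched as a comparison of two norm tables. The layout assumed here (a mapping from a norm entry, such as an item type and score, to a learning age in months), the function name, and the sample values are hypothetical; the source reports only the resulting percentages.

```python
def compare_norms(full, reduced):
    """Compare norms from the full group with norms from a reduced group.

    Both arguments map a norm entry to a learning age in months. Returns
    the per cent of shared entries located identically and the largest
    discrepancy in months.
    """
    shared = [key for key in full if key in reduced]
    diffs = [abs(full[key] - reduced[key]) for key in shared]
    identical = sum(1 for d in diffs if d == 0)
    return 100.0 * identical / len(shared), max(diffs)


# Hypothetical norm tables with four entries each.
full = {("beads", 1): 48, ("beads", 2): 60, ("blocks", 1): 54, ("blocks", 2): 72}
reduced = {("beads", 1): 48, ("beads", 2): 60, ("blocks", 1): 54, ("blocks", 2): 78}
pct_identical, largest_gap = compare_norms(full, reduced)
```

In the study itself the corresponding figures were 89 per cent identical, with no discrepancy greater than six months.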

TABLE 1

YEARS-IN-SCHOOL DISTRIBUTION BY AGES AND
A COMPARISON OF THE MEAN C.A.'s AND THE MEAN L.A.'s
FOR THE STANDARDIZING GROUP

                     Years in School
Age       0     1     2     3     4     5     6   Total   Mean C.A.   Mean L.A.
4         9     1                                    10     4-1         4-4.8
5        33     9                                    42     5-0.7       5-1.8
6        39    16     4     1                        60     6-0.3       6-3.5
7        22    42    15     4     1                  84     7-0         7-2.3
8        11    31    27    14     4                  87     7-11.4      8-0.7
9         6    29    43    31     5     3           117     8-11.7      9-2.9
9-9       5     9    15    17    12     6     2      66     9-9         9-6.5
Total   125   137   104    67    22     9     2     466




Table 1 gives the number of individuals tested at each
chronological age level. The small number of cases in the
lowest age group means that the norms will be less reliable
at the age of four. This table also shows the mean
chronological age and the mean learning age for each of
the age groups. In no instance does the mean chronological
age differ more than one month from the desired
chronological age. The mean learning ages likewise correspond
closely to the mean chronological age. At each
age level, except one, the mean L.A. is slightly higher
than the mean C.A It is felt that this is a desirable fea- 




TABLE 2

PER CENT OF EACH AGE GROUP


BEAD STRINGING          BLOCK BUILDING


Total per 2 Min

Age 7-8 9-10 11-12 13-14

Total Patterns
I II III IV

V 

_1 _ 

2 

Total Score 

3 4 5 6 

7 

i 

+ 

100 

90 

40 30 

40 

10 




100 

IOO 

30 

10 

10 




S 

100 

91 

79 55 

60 

29 

12 



100 

100 

SI 

40 

17 

2 



6 

LOO 

97 

9+ 87 

92 

47 

20 

5 


100 

100 

98 

82 

52 

27 

S 

3 

7 

too 

99 

99 97 

96 

87 

45 

12 

2 

100 

99 

99 

90 

73 

48 

24 

6 

8 

100 

100 

100 100 

100 

95 

67 

32 

9 

100 

100 

99 

94 

85 

69 

44 

11 

9 

100 

100 

100 100 

100 

99 

SB 

57 

19 

100 

100 

99 

98 

95 

83 

66 

39 

9-9 

LOO 

100 

100 100 

100 

100 

89 

_70_ 

21 

100 

100 

100 

100 

98 

95 

80 

50 




PICTORIAL 

ASSOCIATIONS 






PAPER FOLDING 





Total Score 







Total Score 



Age 

1 

2 

3 4 5 

6 

7 

8 

9 

10 

11 

12 1 

2 

3 

4 

5 

6 


+ 

100 

100 

100 90 60 

20 

20 





100 100 

100 

40 

10 



5 

100 

100 

100 93 79 

40 

26 

12 

2 



100 100 

100 

S3 

71 

26 


6 

100 100 100 97 90 

80 

62 

43 

20 

S 

2 

100 100 

100 

97 

90 

72 

U 

7 

1D0 100 

99 99 95 

89 

83 

67 

54 

23 

6 

1 100 100 1D0 

98 

95 

89 

61 

3 

100 

100 

1D0 99 99 

97 

9+ 

85 

67 

51 

18 

6 100 100 

100 

99 

99 

96 

69 

9 

LOO 100 

100 100 99 

99 

97 

96 

82 

73 

39 

12 LOO 100 100 

100 100 

99 

94 

9-9 

100 

100 

ioo ioo loo ioo ioo : 

100 

91 

82 

58 

H 100 100 

100 100 100 100 

95 




MEMORY FOR COLORED OBJECTS




MEMORY FOR 

Age 5 6 

7-8 

9 

Total Score 
10 11 12 

13 

14 

15 

16 

17 

Part A Totals Not in Order 
3 4 5 D C D E 

+ 

100 

90 

40 

20 

10 







100 80 10 10 




5 

100 

98 

62 

26 

14 

2 

2 





100 88 45 64 

10 

10 

ID 

6 

100 

98 

90 

70 

52 

27 

15 

5 

2 



100 95 87 97 

65 

10 

10 

7 

100 

99 

98 

89 

11 

5+ 

31 

13 

G 

1 


100 99 97 98 

83 

38 

15 

8 

100 

100 

99 

93 

8+ 

71 

60 

40 

22 

11 

3 

100 100 100 100 

94 

59 

17 

9 

100 

100 

100 

97 

91 

85 

73 

56 

43 

27 

5 

100 100 100 99 

94 

79 

37 

9-9 

100 

100 

100 

too 

100 

95 

SO 

62 

48 

29 

17 

100 100 100 100 

98 

83 

52 


ture If there were an inadequate sampling of subjects, 
it would likely be of the group with limited ability, since
the mentally less advanced are less likely to have entered
school at a reasonably early age than are those mentally
advanced. The test ceiling is not high and this may be
responsible for the fact that the group at the upper end
of the range has a mean L.A. slightly below their mean
C.A. In only two instances do the two means deviate by
as much as three months, the greatest deviation being 3.8
months at the four-year level.

MAKING EACH SCORE ON EACH TYPE OF ITEM


PICTORIAL IDENTIFICATION VISUAL ATTENTION SPAN 


Total Store 

1 2 M 5-6 7-8 9-10 11-13 13-1415-16 17-18 19 20 21-22 23-2+ l 

Total 
2 3 

Score 
+ 5 

6 

100 100 100 100 

100 

90 

20 

50 

40 

10 



100 

70 

20 

10 



«)0 100 100 100 

93 

95 

93 

79 

60 

33 

10 

5 

IDO 

73 

29 

2 



100 100 100 100 

100 

98 

93 

80 

65 

48 

29 

23 

100 

95 

72 

35 

12 

2 

LOO 100 100 100 

99 

99 

90 

96 

9+ 

85 

75 

60 

100 

99 

90 

57 

21 

3 

100 100 100 100 

100 

100 

100 

99 

98 

98 

95 

87 

loo 

100 

9B 

77 

36 

n 

LOO loo 100 100 

100 

99 

99 

98 

98 

9B 

98 

93 

100 

100 

98 

85 

51 

26 

100 LOO 100 100 

100 

100 

100 

100 

100 

100 

100 

98 

100 

100 

100 

8S 

70 

32 




PUZZLE BLOCKS 






PICTORIAL 

ANALOGIES 



1 

2 

Total Score 

3 4 5 

6 

7 

1 

2 

3 

Total 

4 5 

Score 

6 

7 

8 

9 

10 

90 

30 






10 

10 

10 

10 







100 

95 

55 

12 




100 

76 

71 

43 

14 

J 

2 




LOO 

98 

SO 

52 

12 



100 

100 

95 

80 

57 

30 

12 

2 

2 

2 

too 

98 

87 

73 

38 

5 


100 

99 

96 

89 

79 

52 

27 

10 

1 


100 

100 

99 

89 

6+ 

22 

2 

100 

100 

100 

98 

87 

63 

43 

28 

2 


100 

100 

100 

97 

77 

5+ 

1+ 

100 

100 

100 

98 

94 

91 

76 

56 

26 

7 

too 

100 

100 

100 

89 

63 

21 

100 

100 

100 

100 

9S 

95 

83 

64 

30 

S 


DIGITS 






COMPLETION OF 

DRAWINGS 







B 

In Order 

C D E 

1 

2 

3 

4 

5 

Total Score 

6 7 8 

9 

10 

11 

12 

13 

14 





20 

10 

10 












50 

2 



67 

44 

36 

19 

17 

12 

7 

2 

2 

2 





88 

33 

3 


98 

93 

87 

77 

70 

55 

38 

25 

20 

10 

7 

2 



91 

55 

13 


98 

96 

94 

87 

85 

77 

73 

67 

52 

43 

30 

19 

7 


9B 

SO 

32 

6 

100 

100 

99 

9B 

95 

95 

9+ 

92 

88 

74 

53 

34 

16 

7 

98 

83 

53 

2+ 

100 

100 

100 

100 

99 

99 

97 

96 

88 

86 

79 

71 

47 

16 

100 

97 

73 

32 

100 

100 

100 

100 

100 

100 

100 

100 

100 

97 

94 

S3 

61 

23 


Table 2 gives the percentages of subjects in each age
group making each possible total score for each type of
test item. These per cents were the ones used in plotting
the curves of per cents. These curves were smoothed and
the score which revealed approximately 70 per cent of
success was taken as the score of the average person of
that chronological age. It was then entered in the table
of norms under that learning age. The learning ages
given below 4-0 and above 10-0 are the results of an extension
process and, as has been mentioned before, are not so







educational and psychological measurement 


reliable as are those within the age range of the subjects
examined.
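The 70 per cent rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the sample percentages are hypothetical, and the smoothing of the curves that the author describes is omitted, so the sketch simply picks the score whose percentage of success lies nearest 70.

```python
def score_at_success_level(percent_by_score, target=70.0):
    """Return the score whose percentage of success is nearest the target.

    percent_by_score maps each possible total score to the per cent of
    one age group reaching that score (hypothetical input format).
    """
    return min(percent_by_score,
               key=lambda score: abs(percent_by_score[score] - target))


# Illustrative percentages for one age group on a six-point item type.
example = {1: 100, 2: 95, 3: 72, 4: 35, 5: 12, 6: 2}
norm_score = score_at_success_level(example)
print(norm_score)  # 3
```

With these figures the score of 3 (72 per cent) would be entered in the table of norms for that age.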

Validity. The very methods by which the test items
have been selected and retained are evidence of their
validity. It will be recalled that the items were selected
according to rather definite criteria and that after they
had been given they were subjected to a rigorous item
analysis. Thus the chief criteria for validity were (1)
selection, through critical analysis and adherence to criteria,
and (2) increase in the percentage passing from
one age to the next. In the present scale, it was impossible
to determine validity through correlations with other test
scores inasmuch as there is no existing test which would
have been an acceptable criterion. In the absence of the
needed criterion, correlations were computed between
the score on the entire scale (the score on the entire scale
is the median learning age of the learning ages obtained
on the several parts of the scale) and the score on each
group of items. The correlation setup is seemingly a
spurious one since a part of the test has been correlated
with the whole test which includes this part. As
the score on the entire scale is the median score of the
parts of the scale, however, each part has an approximately
equal share in producing this total or final score
and this in turn lessens or eliminates the possibility of

TABLE 3

CORRELATIONS BETWEEN THE LEARNING AGE OBTAINED ON ONE
SECTION OF THE TEST AND THE MEDIAN LEARNING AGE
OBTAINED ON THE ENTIRE TEST

                                   Group I         Group II
                                   (Age 4 to 7)    (Age 8 to 10)

 1. Memory for Colored Objects     [illegible]     .740
 2. Bead Stringing                 .812            .729
 3. Pictorial Associations         .643            .693
 4. Block Building                 .797            .718
 5. Memory for Digits              .755            .773
 6. Completion of Drawings         --              .702
 7. Pictorial Identification       .730            --
 8. Paper Folding                  .843            --
 9. Visual Attention Span          .637            .629
10. Puzzle Blocks                  --              .734
11. Pictorial Analogies            --              .742




the obtained correlations being spuriously high. Since
the correlations between the learning age obtained on each
group of items and the median learning age on the entire
scale are within the range of from .629 to .843, they are
evidence of high internal consistency and thus, perhaps,
of high item validity.
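The score on the entire scale, described above as the median of the learning ages obtained on the several parts, can be sketched as follows. The year-month notation ("6-3") follows the text; the function names and the sample learning ages are hypothetical, and rounding the median to a whole month is an assumption.

```python
from statistics import median


def to_months(age):
    """Convert an age written as 'years-months' (e.g. '6-3') to months."""
    years, months = age.split("-")
    return int(years) * 12 + int(months)


def to_age(months):
    """Convert months back to 'years-months', rounding to a whole month."""
    months = round(months)
    return f"{months // 12}-{months % 12}"


def median_learning_age(part_ages):
    """Median of the learning ages earned on the separate parts."""
    return to_age(median(to_months(age) for age in part_ages))


# Hypothetical learning ages earned on five parts of the scale.
parts = ["6-0", "6-6", "5-9", "7-0", "6-3"]
print(median_learning_age(parts))  # 6-3
```

Because the median rather than the sum is used, no single part can pull the final score far from the others, which is the point the author makes about spurious part-whole correlation.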

The abbreviated scale. To determine whether a dependable
short scale could be assembled, the five types of
items which showed the highest correlation with the median
learning age for the entire scale were selected to form
the abbreviated scale. Since some of the groups of items
do not function over the entire age range, correlations
were derived separately for two groups. Group I was
composed of all members of the standardizing group who
were seven years or under, and Group II, those who were
from eight to 10 years of age. For Group I, correlations
with the total scale were obtained for all groups of items
except those which do not function at the lower levels, and
for Group II, correlations with the total scale were obtained
for all groups of items except those which do not
function at the higher levels. The test booklets were
rescored on the basis of these abbreviated scales and correlations
were found between the median learning ages
obtained from the abbreviated scales and the original
scale. The correlation for Group I was .944 and for
Group II .936. Thus, when time limitations make it necessary,
the short forms may be used with a considerable
degree of confidence. These abbreviated forms can be
given in approximately 30 minutes.
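The selection rule for the abbreviated scale (take the five item types correlating most highly with the median learning age) can be sketched as a simple ranking. The coefficients below are the Group II values as read from Table 3, with decimal points assumed; the function name is hypothetical, and the result is only what the rule yields on those readings, not a statement of which items the author actually marked on the record blank.

```python
def top_item_types(correlations, k=5):
    """Return the k item types with the highest correlations, best first."""
    ranked = sorted(correlations.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Group II (ages 8 to 10) coefficients as read from Table 3.
group_ii = {
    "Memory for Colored Objects": 0.740,
    "Bead Stringing": 0.729,
    "Pictorial Associations": 0.693,
    "Block Building": 0.718,
    "Memory for Digits": 0.773,
    "Completion of Drawings": 0.702,
    "Visual Attention Span": 0.629,
    "Puzzle Blocks": 0.734,
    "Pictorial Analogies": 0.742,
}

for name in top_item_types(group_ii):
    print(name)
```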

Although it is recommended that the learning age be
used in preference to the learning quotient (LQ =
LA / CA), in order to make the study more complete
and significant the writer has made a study of the LQ's
of the standardizing group. Table 4 shows that the mean
learning quotients derived from the standardizing group
closely approximate the desired mean of 100. The greatest
deviation is at age four and is probably due to the



small number of cases and to the fact that they are a
somewhat select group.
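A short sketch of the learning quotient follows. The text gives the ratio LA / CA; since the desired mean is stated to be 100, the sketch assumes the ratio is multiplied by 100, as with other quotients of this kind. The function name and the sample ages are hypothetical.

```python
def learning_quotient(la_months, ca_months):
    """Learning quotient: learning age over chronological age, times 100.

    The factor of 100 is an assumption inferred from the stated
    desired mean of 100; both ages are expressed in months.
    """
    return 100.0 * la_months / ca_months


# A child of 6-0 (72 months) earning a learning age of 6-6 (78 months).
print(round(learning_quotient(78, 72), 1))  # 108.3
```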


TABLE 4

THE MEAN LEARNING QUOTIENT, RANGE, STANDARD DEVIATION
OF THE LQ's, AND STANDARD ERROR OF THE MEAN FOR EACH
AGE LEVEL OF THE STANDARDIZATION GROUP

Age     Mean LQ    Range of LQ's    σ LQ      σ M

4       108.5      94-127           10.909    3.4500
5       102.7      80-124           10.518    1.6228
6       104.4      65-139           14.470    1.3661
7       103.4      43-132           15.300    1.5912
8       101.7      65-137           15.135    1.6226
9       104.0      55-134           14.361    1.3277
9-9      99.1      73-120           11.410    1.4040


The mean LQ at each age level except the upper
group (9-6 to 10-0) is slightly above 100. As has been
mentioned before, this is probably a desirable feature.
The lower mean of the upper group is apparently the result
of the limited test ceiling. The standard deviations at
each age agree closely, except at the two extremes where
the attenuating factors before mentioned have influenced
them. In general, the standard deviation of the means is
approximately 1.6 (disregarding the four-year group).
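The standard deviation of the mean reported for each age level is, presumably, the usual standard error: the standard deviation of the LQ's divided by the square root of the number of cases. The sketch below assumes that formula. The number of cases per age level is not given in this part of the text, so the group size of 10 used here is purely an assumption for illustration.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


# SD of the LQ's at age four (Table 4) with an assumed group of 10 cases.
print(round(standard_error_of_mean(10.909, 10), 2))  # 3.45
```

With the assumed group of 10 the result is close to the value tabled for the four-year level, which is consistent with the author's remark that the number of cases at that age was small.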
Every effort has been made to make the test usable
and yet have the mechanics as simple as possible. The
record blank (Table 5) has been no exception and has
been patterned after the one devised by Hildreth and
Pintner. The record blank is in reality a table of norms
and the various scores are checked on the blank and the
median score is calculated. The items of the abbreviated
scales also are indicated on the blank.

The test items not only are attractive to young deaf
children but also they have a rather high discriminative
value. The scale is not difficult to administer or score and
since it is weighted heavily with tasks similar to those
which the deaf child must do in the early years of his educational
career, it should be extremely valuable for gaining
a better understanding of the abilities of the younger
















deaf children. This is not intended to imply that the
inexperienced person could give the test satisfactorily.
The person who is unfamiliar with individual testing
techniques would have considerable difficulty unless he
underwent a period of training or practice with this scale.
It is quite conceivable that the person who has had some
experience in individual testing and who has some knowledge
of deaf children could, after a period of training in
which he gave six or eight practice tests, administer the
scale quite satisfactorily.





PERFORMANCE TESTING IN PUBLIC 
PERSONNEL SELECTION 

PART I 

SIDNEY W. KORAN¹

Employment Board, Pennsylvania Department of Public Assistance 


Introduction 

IT IS an interesting fact that although the use of performance
tests in the selection of public personnel
enjoys not only the general endorsement of personnel technicians
but the enthusiastic and unsolicited support of the
public as well, there is probably no other aspect of the
examination process at present more completely neglected
by the majority of merit system agencies.

Probably all jurisdictions employ performance tests
in the selection of typists and stenographers, and the general
practice is to convert the ratings on these tests into
quantitative terms capable of combination with scores
achieved in other portions of the examination battery.
Beyond that, however, the performance testing of most
agencies seems seldom to go beyond the administration of
qualifying tests to a sufficient number of individuals at
the top of certain registers to satisfy immediate certification
requirements. Except for the case of tests of typing

¹ The author desires to express his appreciation to the following individuals:
Mrs. Ruth Glenn Pennell and Mr. Robert Hall Craig, members of the Employment
Board; Miss Hilda P. Thompson, the Board's Executive Director; Dr. C.
H. Smeltzer, the Board's Technical Consultant; Miss Kathleen Oyster, Traffic
Representative of the Bell Telephone Company's Harrisburg office; Mr. Andrew
S. Hay, Service Supervisor of the IBM Harrisburg office; Mr. Bernard Gehring
of the Multigraph Sales Agency in Harrisburg; and Miss Alice I. Thompson,
of the Penn State Alumni Association.




and stenography, the technique of using the performance
test as a major part of the test battery (that is, as a factor
which may decidedly influence an examinee's relative
standing on the eligibility register) appears to have been
almost completely ignored.

There are, of course, various reasons why this situation
exists. Associated with the technical difficulties inherent
in the construction and administration of the performance
test (difficulties which, incidentally, are frequently not
nearly so "insurmountable" as they at first appear) may
be the factors of cost and already overburdened technical
staffs. In addition, newly created agencies frequently face
time deadlines which all but preclude their going beyond
the commonly accepted minimum selection elements,
namely the use of minimum requirements, a written test,
an evaluation of training and experience, and, for certain
positions, an oral interview. In addition to these factors,
however, and probably overshadowing them in effect,
must be mentioned two others: general inertia and, probably
very closely related, an uncritical adherence to time-honored
examination patterns considered satisfactory in
selecting persons for jobs not requiring the possession of
manual skills.

Considerable progress has already been made in the development
of performance tests for the selection of typists
and stenographers. Since their use is so widespread, no
further attention will be devoted to them in this discussion
beyond pointing out that, despite their popularity,
much necessary work still remains to be done toward their
improvement, especially in the development of (1) suitable
standards of performance, (2) satisfactory scoring
procedures, and (3) improved standardized techniques
for administering the stenographic portion of the test.²

² The Chicago Park District's use of phonographic recordings and the novel
experiments of the Buffalo Municipal Civil Service Commission and the Arizona
Unemployment Compensation Merit System Council with radio broadcasting are
examples of approaches to the problem of minimizing or eliminating the undesirable
effects of varying dictation speeds and other factors which characterize
the use of numerous proctors.





PERFORMANCE TESTING IN PUBLIC PERSONNEL 


Techniques have also been developed for measuring
performance in other jobs such as chauffeur and in certain
skilled trades. Many companies test applicants for chauffeur
positions and most state motor vehicle bureaus give
qualifying driving tests to applicants for operator licenses.
The latter are ordinarily quite informally conducted, but
Viteles has described "a trade test of driving skill"³ which
could quite readily be adapted to merit system use.⁴

The New York City Civil Service Commission's excellent
pioneer work in developing tests for such skilled
trades positions as welder, machinist, electrician, locksmith,
lineman, and carpenter is quite well known.⁵
Recently the State Technical Advisory Service of the
Social Security Board began work on standardizing a
performance test for Key Punch Operators.

In general, however, the use of the performance test
as a measuring instrument designed to serve not only as
a qualifying hurdle, but also as an important factor in
determining the examinee's relative standing on the eligibility
register has received much less attention than it
deserves. The dearth of literature on the subject attests to
this and probably contributes to the widely held feeling
that performance tests capable of producing quantitative
ratings are somehow exceptionally difficult to prepare
and impractical to administer.

It is hoped that this presentation will illustrate some
of the possibilities by describing four actual tests used
successfully by a medium-sized merit system agency, the
Employment Board of the Pennsylvania Department of
Public Assistance. Small jurisdictions may be able to use
some of the material with few or no changes. Larger
agencies, especially those with adequate technical staffs,

³ Viteles, Industrial Psychology (New York: W. W. Norton Company,
1932), 221-24.

⁴ The Los Angeles City Civil Service Commission has developed tests of
this kind for the positions of Auto Fireman, Ambulance Driver, and Motor
Truck Driver.

⁵ Fifty-sixth Annual Report, 1939 and First Half of 1940, Civil Service
Commission, City of New York.




will probably want to develop their own examinations.
Even the latter, however, if their experience with this
type of test has been limited, may find it helpful to consider
another agency's approach.

Building the Performance Test

The initial steps to be followed in constructing the
performance test do not differ in any important respect
from those basic to the construction of written tests. In
both, the starting point is careful analysis of the job for
which the test is to be designed. In constructing a performance
test the analysis must include actual on-the-job
observations of both the equipment and the persons doing
the work. If the test constructor is not himself a competent
operator of the machine, it will not suffice for him to
confine himself merely to study of the printed job specifications
and technical literature and to conversations with
experts and workers. Such an approach is inadequate
even in the construction of written tests, where it is all too
often the rule rather than the exception; as the only preparation
to designing a performance test it can produce
very unfortunate results.

Every one of the foregoing steps (study of the job
specifications as part of the agency's classification plan,
study of technical literature available on the equipment
and on its operation, and conferences with skilled workers,
supervisors, and acknowledged experts in the field) has
its place in the procedure. That place is as a supplement
to a first-hand acquaintance with the job itself. The professional
test constructor must analyze the job sufficiently
thoroughly to permit himself to identify the skills involved
and to determine their relationship to one another
and to the whole, and he must discover those individual
differences which will provide him with essential clues to
types of test items likely to prove valid in differentiating
among various levels of performance ability.

Here it may be worth pointing out that there is perhaps
no other phase of the examination program in which
the personnel technician is less likely to turn out a satisfactory
job unless he consults with specialists who know
the practical and technical aspects of the job to be tested.
Both specialists, the personnel technician and the expert
in the occupational field under consideration, bring to
the task certain information and knowledge of techniques
which need to be reconciled toward a common end, that
of producing a valid measuring instrument capable of
fulfilling the numerous practical considerations which
public agencies cannot afford to forget. The test constructor
will want to find an expert who knows the job and who
is sufficiently progressive, adaptable, and interested in the
problems of personnel selection to be cooperative and
sympathetic. The length of time required to orient such
a co-worker in the problems of testing will not be great,
and the effort will pay big dividends in the form of a
smooth working relationship, a valid measuring instrument,
and a strong ally in the event of later criticism.

It should be kept in mind that the performance test
should (1) be sufficiently long to include an adequate
sample of the differentiating essentials of the job, (2) be
as inexpensive and easy to administer as possible, (3)
minimize possible differences in achievement resulting
from lack of immediate familiarity with the particular
model of equipment on which it is given, (4) appear
sufficiently practical and comprehensive to create a favorable
impression among those who do not qualify as well
as among those who do, (5) be capable of uniform administration
to all candidates, and (6) be objectively scored and
produce quantitative ratings.

Setting the Passing Point

Since one of the functions of the performance test is 
to eliminate candidates who do not demonstrate adequate 
ability to operate the equipment, passing points must be 



established with considerable care. Fortunately, this
problem can usually be approached much more directly
when performance tests are involved than it can with
written tests. Practical considerations ordinarily make it
impossible for most jurisdictions to employ standardized
written tests or to develop satisfactory norms for the tests
they construct. The usual practice, therefore, in agencies
not bound by restrictive "70 per cent passing" legislation
is to permit such factors as the following to influence the
location of written test passing points: the number of
examinees, the number of openings likely to occur during
the life of the register, the general caliber of the competing
group, whether or not the examination battery includes
such other hurdles as a performance test or an oral
interview, and previous certification experiences concerning
the ratio of refusals to acceptances.

In establishing the qualifying point for a performance
test, on the other hand, the principal criterion must be an
affirmative answer to the question, "Can the examinee perform
the task well enough to meet the employer's minimum
standards?" Production records are ordinarily
available on types of work sufficiently similar to those
sampled by the test to serve as the basis for setting the
elimination point. Where such records are not available
or are not in usable form, they can generally be obtained
quite easily, and profitably, too, during the test technician's
study of the job.

While such data should usually serve as the principal
basis for establishing the qualifying grade, they ought not
to be the sole consideration. Some of the other factors, for
example, which it is frequently important to note are (1)
the level of ability of the agency's employees as compared
with that of other persons doing the same kind of work,
(2) the immediate, and possible future, condition of the
labor market in the specific field under consideration and
in related fields, and (3) the possible effects of nervousness,
atypicality of the test situation, and other factors



likely to be present and to lower the validity and reliability
of the examination.

Four Performance Tests

The remainder of this presentation will be devoted to
describing, with a minimum of discussion, some of the
forms and procedures developed in connection with performance
tests for the following four kinds of jobs: Telephone
Operator, Graphotype-Addressograph Operator,
Tabulating Machine Operator, and Duplicating Machine
Operator.

In considering the material, the reader should keep
the following facts in mind:

1. Each of the tests was designed for examinees who had "passed" a
previous hurdle, that of scoring above the 60th percentile in the
combination of their written test score and training-experience
rating.

2. The law under which the examining agency operates forbids the
establishment of any kind of minimum training and experience
requirements.

3. The operating agency prefers to make no provision for training new
employees for these positions and requires that a new appointee
be able to perform the duties of the position almost immediately.

4. Each of the tests was designed to serve as a qualifying examination
capable of weeding out individuals lacking in sufficient operating
ability to produce satisfactory work, and as a measuring instrument
capable of producing quantitative ratings of relative ability to perform
the work.

5. The validity of none of the tests has been determined through the
use of statistical procedures. For the present, their only claim to
validity is based on the fact that (1) experts in the fields covered
by the tests state that they measure what they purport to measure,
and (2) employees certified from registers established on the basis
of these tests have proved more uniformly satisfactory to their
employers than those certified from eligibility lists set up from
examination batteries which did not include performance tests.⁶

⁶ Studies of reliability and of the relation between performance test scores
and service ratings, written test scores, training scores, and experience scores
are now under way.
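The preliminary hurdle named in the first point (scoring above the 60th percentile on the combined written test score and training-experience rating) can be sketched as below. The text does not say how the two parts were combined or how the percentile was computed, so the single combined score per examinee, the "per cent of examinees scoring lower" definition of percentile rank, and all names and figures here are assumptions for illustration.

```python
def percentile_rank(score, all_scores):
    """Per cent of examinees whose combined score is below the given score."""
    below = sum(1 for other in all_scores if other < score)
    return 100.0 * below / len(all_scores)


def admitted_to_performance_test(combined, cutoff=60.0):
    """Names of examinees standing above the cutoff percentile."""
    scores = list(combined.values())
    return sorted(name for name, score in combined.items()
                  if percentile_rank(score, scores) > cutoff)


# Hypothetical combined scores for ten examinees.
combined = {"A": 55, "B": 62, "C": 70, "D": 74, "E": 81,
            "F": 90, "G": 66, "H": 58, "I": 77, "J": 85}
print(admitted_to_performance_test(combined))  # ['E', 'F', 'J']
```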




The Test for Telephone Operators
This test was designed to be administered to examinees
in the 20 counties of the state in which vacancies existed
for the position of telephone operator in either the main,
or a regional, office of a local Board of Assistance. The
counties were unusually widely scattered geographically.
The dial system was in use in 60 per cent of the 20 counties
and the manual system in the remainder, but the PBX
boards were all of the cord type. The switchboards in 90
per cent of the offices were connected to four or more
trunk lines; the largest two had 12 and 15 trunk lines
respectively. The number of extension stations varied
from 11 to 66, but 60 per cent had 20 or more.

The preliminary survey of the physical facilities available
in each of the 20 counties for which eligibility registers
were to be established was accomplished by means of
a letter in which a questionnaire containing the following
questions was enclosed:
1. Is your PBX switchboard of the cord type?

2. How many city trunk lines do you have?

3. Is a dial part of the equipment of your switchboard?

4. Is the entire city in which your office is located equipped with dial
telephones?

5. How many extension stations do you have?

6. Is there an office near your switchboard in which there are two
separate extension lines (not two extensions of the same line)?

7. If so, is the office in which these two lines are located within hearing
distance of the ringer of a third line?

8. Would you be willing to permit us to use your switchboard for
the performance test if the test is scheduled on a Saturday afternoon
or on a week day evening when there are few or no business calls
likely to interfere?

Twelve examination centers were established. The
factors which dictated their selection were (1) the type
of equipment available in each county for which a register
was to be established, (2) the relative proximity of
counties with similar equipment, (3) the distance each
examinee would be required to travel, and (4) the cost of
administration.




The test itself consisted of a series of 13 operating
situations designed to determine the examinee's ability to
service incoming, outgoing, and extension calls and transfers
of incoming and outgoing calls, all under as nearly
normal operating conditions as were practically possible
to achieve. The minimum equipment necessary for the
administration of the test was a cord-type switchboard
having four trunk lines and five extensions. Two forms of
the test were required: one, Form D, for operators of
PBX installations in communities where the dial system
was in use; the other, Form M, for manually operated
installations.

The test was administered by two persons (designated,
in the test, as Mr. Albert and Mr. Brown), one of whom
was required to be well acquainted with the procedure
and to have had some practice in its administration. The
two examiners used separate extension telephones but
were situated within earshot of each other and within
hearing distance of a third extension telephone.

The administration, recording, and scoring of the
test were facilitated by the development of a combined
"cue sheet" and rating form designed to serve the four-fold
function of (1) indicating the sequence of operations
so that each examiner would know what his task
was at every stage of the test, (2) listing the phrases to
be repeated verbatim by both examiner and examinee,
(3) enumerating the items on which the examinee was to
be rated, and (4) providing spaces for the examiners' ratings
and comments. Each examiner was provided with a
copy of this form and, as Mr. Albert or Mr. Brown, was
required to originate, maintain, and terminate the calls
assigned to him and to rate the examinee on each phase of
every call coming to his attention.

As an aid in orienting the examinee to the test situation
she⁷ was provided with an Instruction Sheet (Exhibit
A) which set forth the general nature of the test she
was about to take and listed a few simple instructions such
as any experienced operator would need to be given on
starting a new job. When the examinee had been permitted
ample time to read the Instructions, she was assigned
to the switchboard. Several minutes, if necessary,
were then allowed to permit the examinee to familiarize
herself with any aspects of the board which were strange
to her and to note the location of the jacks and names
mentioned in the Instructions. When the examinee was
ready to begin, the receptionist told her to ring Mr.
Brown's extension and to read her identification number
to him from her admittance slip. This operation, as well
as the first call placed by the examiner, was intended to
help "break the ice" and did not enter into the determination
of the examinee's score.

⁷ The feminine pronoun is used because all examinees for the telephone
operator performance test were women.

Exhibit B is a reproduction of the test administered
to examinees required to operate switchboards in dial-equipped
communities. While the calls comprising the
manual form of the test were similar in number and complexity
to those included in the dial form, different cue
and rating sheets were required because several of the
operations (and, consequently, the points to be rated)
were not the same for both systems.

Some idea of the variety of realistic operating situations
existing during the administration of the test (despite
the fact that all of the calls were originated by only
two persons) may be gathered from an examination of
some of the calls the operator is required to handle. In
call No. 4 (see Exhibit B), the operator connects Mr.
Albert's extension to a city line. A few moments later, the
operator is telling the person whose call has come in on a
trunk line that Mr. Carson's extension is busy (call No.
5). To maintain the connections required at this stage of
the test the operator had to put up seven cords. In the
four calls immediately following, the operator was required
to perform these tasks:



Call No. 6

answer Mr. Albert's extension;
transfer the incoming call from Mr. Carson's extension to Mr.
Albert's;
take down the connection from one of the trunk lines and from
Mr. Albert's and Mr. Brown's extensions.

Call No. 7

answer Mr. Brown's extension;
connect Mr. Brown's extension to a city line so that Mr. Brown
may dial his number through the central exchange.

Call No. 8

answer an incoming call;
inform the person calling that Mr. Brown's line is busy;
hold the incoming call until Mr. Brown's line is no longer busy.

Call No. 9

answer Mr. Albert's extension;
transfer the outgoing call from Mr. Brown's extension to Mr.
Albert's extension;
take down the connections from one of the trunk lines and
from Mr. Albert's and Mr. Brown's extensions.

The scoring procedure was designed (1) to permit
the immediate elimination of candidates whose performance
fell below certain established minimum standards,
and (2) to produce quantitative ratings reflecting the
relative operating ability of the examinees who satisfied
these minimum standards. Because both the level of difficulty
of the duties and the relative ability of candidates
to perform the duties varied in direct relation to the size
of the county in which the jobs occurred, minimum standards
(based on the 12 calls comprising the test) were set
on a class-county basis as follows:

Class II Counties      10 completed calls
Class III Counties      9 completed calls
Class IV Counties       8 completed calls

For the purpose of applying the minimum requirements
represented by these criteria, a call was considered
"completed" if the operation or operations essential to
recognition of that particular call were carried out sufficiently
well to receive credit. Thus, an Extension to
Trunk call was considered to have been completed if the
examiner had granted credit for the ringing signal; a
Trunk to Busy Extension call if credit had been granted
for the busy report; Transferring an Incoming Call if the
connection was maintained; etc.
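The class-county minimums stated above amount to a simple lookup against the number of completed calls. The sketch below uses the three minimums and the figure of 12 scored calls given in the text; the function and constant names are hypothetical.

```python
# Minimum completed calls by county class, as stated in the text.
MINIMUM_COMPLETED_CALLS = {"II": 10, "III": 9, "IV": 8}
SCORED_CALLS = 12  # calls on which the minimum standards were based


def qualifies(county_class, completed_calls):
    """True if the examinee meets the minimum for her county class."""
    if not 0 <= completed_calls <= SCORED_CALLS:
        raise ValueError("completed_calls must be between 0 and 12")
    return completed_calls >= MINIMUM_COMPLETED_CALLS[county_class]


print(qualifies("II", 9))   # False
print(qualifies("III", 9))  # True
```

Only examinees for whom this check succeeds would go on to receive a quantitative rating under the schedule of credits.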

The actual steps followed in scoring the test are
enumerated in Exhibit C, which is a reproduction of the
instructions furnished the scorers. Two copies of the
scoring form (referred to in Exhibit C as EB-695) are
reproduced as Exhibits D and E. The former is the record
of an examinee who did not complete a sufficient number
of calls to qualify in her county (Class II). The latter
represents the rating of an examinee in a Class III County
who completed considerably more than the minimum required
in her county.

The use of the schedule of credits shown in Exhibit F
made it possible to convert the approximately 75 sub-operations
comprising the test into quantitative ratings.⁹ The
maximum attainable score for the test designed for operators
of dial equipment was 143; for operators of manual
equipment, 123. To facilitate the scoring, keys were constructed
which turned the task into a routine operation
easily performed by clerks experienced in scoring objective
tests.


Part II of this article will appear in the October issue of
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT.


he correlation between the number of calls completed and the score 
derived by applying the schedule of cicdits shown in Exhibit F is naturally 
quite high In a test compusing 20 or 25 calls it would probably be unnecessaiy 
to go to the added double of weighting and scoring each pari of a call How¬ 
ever, several considerations suggested the tlesn ability of doing so in the par 
ticular rest described 
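
The partial-credit notes in the Exhibit F schedule (full credit when a connection is maintained, half credit when it is only completed) and the voice credits can be illustrated with a minimal sketch. The point values are taken from Exhibit F; the function and constant names are hypothetical, and the assumption that an unsatisfactory voice rating earns no credit is ours, since the schedule lists credits only for the two satisfactory ratings.

```python
# Point values as printed in the Exhibit F schedule of credits.
EXTENSION_FULL, EXTENSION_PARTIAL = 6, 3   # extension-to-extension connection
TRANSFER_FULL, TRANSFER_PARTIAL = 8, 4     # incoming call transferred

# Voice credits per observer; ratings not listed are ASSUMED to earn 0.
VOICE_CREDITS = {"satisfactory": 2, "very satisfactory": 3}


def connection_credit(completed, maintained, full, partial):
    """Full credit if the connection was maintained, partial if only completed."""
    if completed and maintained:
        return full
    if completed:
        return partial
    return 0


def voice_credit(observer_ratings):
    """Sum the voice credits awarded by each observer."""
    return sum(VOICE_CREDITS.get(rating, 0) for rating in observer_ratings)


# Example: an extension call completed but dropped, a clean incoming transfer,
# and two observers rating the voice "satisfactory" and "very satisfactory".
example_total = (
    connection_credit(True, False, EXTENSION_FULL, EXTENSION_PARTIAL)
    + connection_credit(True, True, TRANSFER_FULL, TRANSFER_PARTIAL)
    + voice_credit(["satisfactory", "very satisfactory"])
)
```

Keeping the full and partial values as arguments lets the same helper serve both the 6/3 and the 8/4 rules of the schedule.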





PERFORMANCE TESTING IN PUBLIC PERSONNEL 


Exhibit A 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
of the 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test for Telephone Operators

Series 1000
August 1940

INSTRUCTIONS TO EXAMINEES

Important: Failure to follow instructions may
result in disqualification from the examination.

The examination you are about to take has been designed to test
your ability to perform some of the tasks ordinarily required of a
telephone operator in the Department of Public Assistance.

When your turn arrives, you will be assigned to a PBX cord-type
switchboard. The designation strips on this switchboard will indicate
the location of four or more trunk lines and the following extension
stations:

Mr. Albert        Mr. Carson

Mr. Brown         Mr. Drake

Official

You will be given a few minutes to familiarize yourself with any
aspects of the switchboard which are strange to you and to note the
location of each of the jacks indicated above.

When the Proctor tells you to do so, ring Mr. Brown's telephone.
Mr. Brown will answer and ask you to repeat your Identification
Number, the number which appears on your Admittance Slip.

The examination, which consists of making various combinations of
simple connections, will then begin. The first connection you will be
required to make will be a practice exercise on which you will not be
graded.

When answering calls or acknowledging orders, the following
phrases must be used:

Answering incoming calls—"Public Assistance."
Answering extension calls—"Yes, please?"
Acknowledging orders—"Thank you."

Note: On incoming calls, if the Calling Party requests information
regarding the Department of Public Assistance or asks to talk to anyone
besides the four persons whose names are shown on the designation strip,
the call must be connected to the extension marked "Official."




Exhibit B

COMMONWEALTH OF PENNSYLVANIA
EMPLOYMENT BOARD
of the
DEPARTMENT OF PUBLIC ASSISTANCE
Harrisburg

Performance Test for Telephone Operators
Series 1000

Mr. Albert □
Mr. Brown □

Identification No.
Center
Date
Form D


1. EXTENSION TO EXTENSION (Practice Exercise)

Mr. Albert lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Albert asks for Mr. Brown: "Mr. Brown, please."
Operator acknowledges.
    "Thank you." ( )
Operator rings Mr. Brown.
    Brown's phone rings ( )
Mr. Brown answers: "Mr. Brown speaking."
    Connection completed ( )
    Connection maintained ( )
Mr. Albert and Mr. Brown hang up after a few seconds.


2. EXTENSION TO TRUNK

Mr. Brown lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Brown asks for city line: "City line, please."
Operator acknowledges.
    "Thank you." ( )
Operator connects Mr. Brown with trunk line.
    Dial tone ( )    Promptness ( )
Mr. Brown dials listed number.
    Ringing signal ( )
Mr. Albert personally checks number of trunk
line to which Mr. Brown has been connected ( )


3. TRUNK TO EXTENSION WHICH DOES NOT ANSWER

Mr. Brown's call comes in on trunk line.
Operator answers.
    Promptness ( )
    "Public Assistance." ( )
Calling party (Mr. Brown) asks for Mr. Carson: "Mr. Carson, please."
Operator acknowledges.
    "Thank you." ( )
Operator rings Mr. Carson.
    Carson's phone rings ( )
    Promptness ( )
Operator gives ringing report. Mr. Brown
tells Operator to continue ringing.
    Ringing report every 40 seconds ( )
    Appropriate phrase ( )
    Connection maintained until transfer ( )


4. EXTENSION TO BUSY EXTENSION, EXTENSION TO TRUNK

Mr. Albert lifts receiver a few seconds after
Mr. Carson's telephone first rings.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Albert asks for Mr. Brown: "Mr. Brown, please."
Operator gives busy report.
    Busy report ( )    Promptness ( )
Mr. Albert asks for city line: "City line, please."
Operator acknowledges.
    "Thank you." ( )
Operator connects Mr. Albert with trunk line.
    Dial tone ( )    Promptness ( )
Mr. Albert dials listed number.
    Ringing signal ( )


5. TRUNK TO BUSY EXTENSION

Mr. Albert's call comes in on trunk line.
Operator answers.
    Promptness ( )
    "Public Assistance." ( )
Calling party (Mr. Albert) asks for Mr. Carson: "Mr. Carson, please."
Operator gives busy report and asks calling
party to hold line.
    Busy report ( )    Hold line ( )
    Appropriate phrases ( )
Calling party (Mr. Albert) hangs up.


6. TRANSFERRING INCOMING CALL

Mr. Albert lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Albert asks to have Mr. Carson's call:
"Let me have Mr. Carson's call, please."
Operator acknowledges.
    Appropriate phrase ( )
Operator transfers incoming call from Mr.
Carson to Mr. Albert.
    Appropriate phrase ( )
    Transfer ( )    Promptness ( )
    Connection maintained ( )
Mr. Albert and Mr. Brown hang up after a
few seconds.


7. EXTENSION TO TRUNK

Mr. Brown lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Brown asks for city line: "City line, please."
Operator acknowledges.
    "Thank you." ( )
Operator connects Mr. Brown with trunk line.
    Dial tone ( )    Promptness ( )
Mr. Brown dials listed number.
    Ringing signal ( )
Mr. Albert asks Operator number of trunk line
to which Mr. Brown has been connected ( )

8. TRUNK TO BUSY EXTENSION

Mr. Brown's call comes in on trunk line.
Operator answers.
    Promptness ( )
    "Public Assistance." ( )
Calling party (Mr. Brown) asks for
Mr. Brown: "Mr. Brown, please."
Operator gives busy report and asks calling
party (Mr. Brown) to hold line.
    Busy report ( )    Hold line ( )
    Appropriate phrases ( )
Calling party (Mr. Brown) holds line.
    Connection maintained until transfer ( )


9. TRANSFERRING OUTGOING CALL

Mr. Albert lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Albert asks to have call on Mr. Brown's
line transferred:
"Transfer the call on Mr. Brown's line to me, please."
Operator acknowledges.
    Appropriate phrase ( )
Operator transfers outgoing call from Mr.
Brown's telephone to Mr. Albert's telephone.
    Appropriate phrase ( )
Mr. Albert listens for open line.
    Open line ( )    Promptness ( )
Mr. Albert and Mr. Brown hang up.


10. EXTENSION TO EXTENSION

Mr. Brown lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Brown asks for Mr. Carson: "Mr. Carson, please."
Operator acknowledges.
    "Thank you." ( )
Operator rings Mr. Carson.
    Carson's phone rings ( )
Mr. Carson (Albert) answers: "Mr. Carson speaking."
    Connection completed ( )
    Connection maintained ( )
Mr. Carson (Albert) and Mr. Brown hang up
after a few seconds.


11. EXTENSION TO TRUNK

Mr. Albert lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Albert asks for city line: "City line, please."
Operator acknowledges.
    "Thank you." ( )
Operator connects Mr. Albert with trunk line.
    Dial tone ( )    Promptness ( )
Mr. Albert dials all but last digit of listed
number and holds line.
Mr. Brown asks Operator number of trunk line
to which Mr. Albert has been connected ( )


12. TRANSFERRING OUTGOING CALL

Mr. Brown lifts receiver.
Operator answers.
    Promptness ( )    "Yes, please?" ( )
Mr. Brown asks to have call on Mr. Albert's
line transferred:
"Transfer the call on Mr. Albert's line to me, please."
Operator acknowledges.
    Appropriate phrase ( )
Operator transfers outgoing call from Mr.
Albert's telephone to Mr. Brown's telephone.
    Appropriate phrase ( )
Mr. Brown listens for open line.
    Open line ( )    Promptness ( )
Mr. Albert and Mr. Brown hang up.


VOICE

To what extent is the Operator's voice clear, distinct, pleasant?  1  2  3  4
(1) Very unsatisfactory, (2) Unsatisfactory, (3) Satisfactory, (4) Very satisfactory

REMARKS


(8-8-40)


Examiner




Exhibit C 

Procedure for Scoring Telephone Operator Performance Test

Series 1000

Note: All scoring must be checked and must
carry the initials of both scorer and checker.

1. Check to see that there are two rating sheets and an Admittance
   Slip for each examinee and that the Identification Number on each
   is identical.

2. Write the examinee's Identification Number and County in the
   spaces provided on Form EB-695. (Use Form EB-696 for
   Form M.)

3. Place a check mark on the rating sheet after the name of each call
   completed by the examinee.

4. Place a check mark in the appropriate space on Form EB-695 for
   each completed call and enter the total number of calls completed
   in the box provided.

5. Eliminate from further consideration examinees who completed
   fewer than the minimum number of calls required for their County.
   (See attached schedule.)

6. Score the rating sheets of examinees who completed a sufficient
   number of calls. Place the number of credits after each line and
   place the total number of credits for each call in a circle to the
   right of the last line of the call. (See attached schedule of credits.)

7. Transfer the number of credits for each call to the appropriate space
   on Form EB-695.

8. Place a check mark or an "X" in each of the three spaces provided
   after the word "Trunks" on Form EB-695 and refer to the
   attached schedule for the number of credits to be entered in the space
   to the right.

9. Place a check mark in the appropriate spaces after the word "Voice"
   on Form EB-695 and refer to the attached schedule for the number
   of credits to be entered in the space to the right.

10. Enter the total number of credits earned (Raw Score) in the box
    provided on Form EB-695.
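
Steps 3 to 5 of the procedure above amount to a simple screening rule: count the completed calls and compare the count with the minimum for the examinee's county class. A minimal sketch follows; the minimums (10, 9, and 8 completed calls for Class II, III, and IV counties) come from the article, while the function names and the list-of-booleans representation of a rating sheet are hypothetical.

```python
# Minimum completed calls (out of 12) by county class, as stated in the article.
MIN_COMPLETED_CALLS = {"II": 10, "III": 9, "IV": 8}


def count_completed(call_flags):
    """Count the calls marked as completed on a rating sheet."""
    return sum(1 for completed in call_flags if completed)


def qualifies(call_flags, county_class):
    """True if the examinee completed enough calls for her county class."""
    return count_completed(call_flags) >= MIN_COMPLETED_CALLS[county_class]


# Hypothetical examinee: 9 of the 12 calls completed.
sheet = [True] * 9 + [False] * 3
passes_class_iii = qualifies(sheet, "III")  # 9 >= 9
passes_class_ii = qualifies(sheet, "II")    # 9 < 10
```

Only examinees who pass this screen go on to the credit scoring of steps 6 to 10.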




Exhibit F 

Procedure for Scoring Telephone Operator Performance Test 

Series 1000 

SCHEDULE OF CREDITS

Forms D and M*

CREDIT—1 POINT
    Proper phrase
        "Yes, please?"
        "Public Assistance."
        "Thank you."
    Promptness
    Appropriate phrase
    Ringing report
    Request to hold line

CREDIT—2 POINTS
    Ringing extension telephone

CREDIT—4 POINTS
    Busy report
    (M) Central answers

CREDIT—6 POINTS
    Extension-to-extension connection maintained
        Note: If connection is completed but not maintained, 3 points.
    (D) Ringing signal (after dialing)
    (D) Open line (call 11 only)

CREDIT—8 POINTS
    Incoming call transferred (connection maintained)
        Note: If transfer is made without maintaining connection, 4 points.
    (M) Outgoing call transferred (open line)

CREDIT—10 POINTS
    (D) Outgoing call transferred (open line—calls 9 and 12 only)

USE OF HIGHEST NUMBER TRUNK LINE
    Once           (D) 0     (M) 3
    Twice          (D) 5     (M) 8
    Three times    (D) 12

VOICE
    Satisfactory—2 (each observer)
    Very satisfactory—3 (each observer)

PASSING POINTS
    Class II     10 completed calls
    Class III     9 completed calls
    Class IV      8 completed calls

* "D" in parenthesis indicates that the credit applies only to Form D;
"M" in parenthesis indicates that the credit applies only to Form M.




SOME DATA ON THE KUDER PREFERENCE RECORD

ARTHUR E. TRAXLER

Educational Records Bureau

AND

WILLIAM C. McCALL

University of South Carolina

A WELL-ROUNDED guidance program calls for at
least four types of objective measures: general
intelligence, achievement in various fields of study, aptitudes
of different types, and interests or motivation. Far more
progress has been made in the first three of these areas
than in the fourth. In recent years, however, there has
been an especially large amount of experimentation in the
last area, and some promising measuring instruments are
beginning to emerge.

The majority of the noteworthy instruments for
appraising interests have been concerned with occupational
preferences. The most important work in this field has
been done by Strong, who has constructed blanks and
prepared scales for the measurement of the interests of men
with respect to 34 occupations and the interests of women
in connection with 18 occupations. Although the
instruments developed for the measurement of interest in
specific vocations unquestionably have important guidance
values, at least two considerations point to a trend away
from the measurement of interests in occupations as such
and toward the measurement of interests in broad fields.

One consideration is based on observation and
research. It has been known almost from the first attempts
to measure vocational interests that interests in certain
vocations are rather highly correlated. It has been
apparent that there are clusters of occupations that have so
many points of similarity that interest in one occupation
is a strong indication of interest in several others.
Factor-analysis studies have given emphasis to this point. For
example, by means of a factorial analysis of the Strong
Vocational Interest Blank, Thurstone¹ found four interest
groups. These groups were associated with science,
language, people, and business.

The second consideration grows out of the practical,
everyday work of counselors and personnel officers. These
workers have found that frequently when one is
attempting to guide the development of secondary-school pupils,
or even of college freshmen, guidance with respect to
specific occupations is not needed. In fact, guidance into
specialization so early would in many cases be
unwarranted. What is needed is a valid, reliable measure of
interests in fairly broad fields so that the individual may
be guided in the general direction of a group of related
occupations, one of which will perhaps be chosen
definitely when the student has attained greater maturity.

Strong, himself, has been one of the first to recognize
the need for broader measurement of interests as well as
measurement of interests related to specific vocations. In
line with this viewpoint, he has recently published several
group scales for the measurement of interests in broad
areas.

Certain other investigators have been working along
somewhat similar lines. Probably the most promising
new instrument in this general field is the Preference
Record by G. F. Kuder.²


¹ L. L. Thurstone, "A Multiple Factor Study of Vocational Interests,"
Personnel Journal, X (1931), 198-205.

² G. Frederic Kuder, Preference Record (Chicago: Science Research
Associates, 1939).

Description of the Preference Record

The Preference Record is designed for use in
obtaining measures of motivation in the following seven fields:
scientific, computational, musical, artistic, literary, social
service, and persuasive. It consists of 330
paired-comparison items, of which the following are samples:

A (1) Draw graphs
  (2) Do clerical work

B (1) Be a lawyer
  (2) Be a landscape architect

C (1) Sell insurance
  (2) Do scientific research work

The subject indicates in each case which one of the pair
of activities he prefers.

The test is intended for use in high school and college.
It is administered without time limit. The booklet is used
with separate answer sheets, one for hand scoring, one for
machine scoring, and one for self scoring. The raw scores
of an individual student may be plotted on a percentile
chart, and thus a graphic indication of high points and
low points with respect to the seven fields may be
obtained.

Nature and Purpose of the Study

Kuder³ has described the construction of the
Preference Record in some detail and has reported a
considerable amount of statistical data for it. Helpful as these
data are, they naturally do not cover all questions about
the blank. Since no other studies of this new instrument
were available, it seemed desirable to try to obtain
answers to certain questions before arriving at decisions
about the use of the blank in a regular testing program.
The questions which this study attempts to answer are as
follows:

1. What is the retest reliability of the scores on the
   Kuder Preference Record?

2. Are the scores on the Preference Record relatively
   stable over a long period?

3. What differences are there between the mean scores
   for boys and for girls on the Preference Record?

4. Do the mean scores for different secondary-school
   groups change appreciably with change in grade
   level?

5. What is the shape of the mean profiles of
   university freshmen in different fields of study?

³ G. F. Kuder, "The Stability of Preference Items," The Journal of Social
Psychology, X (1939), 41-50.

The data were obtained by administering the
Preference Record to freshmen in the University of South
Carolina, pupils in Grades 10, 11, and 12 of a high school in
South Carolina, and a number of adults who were on the
staff of an educational organization in New York City.

Reliability

In the manual for the Preference Record, Kuder
gives the following reliabilities for the different scales:
scientific, .87; computational, .85; musical, .88; artistic,
.90; literary, .90; social service, .84; persuasive, .90. These
reliabilities were estimated from one administration of
the test to a group of 84 college students through the
application of the Kuder-Richardson method of
estimating reliability coefficients.⁴ Since the procedure employed
is still somewhat experimental and is not as yet generally
used, it seemed advisable to check the reliability of the
various scales by the more familiar test-retest procedure.
Accordingly, 52 college freshmen and 90 high-school
pupils who had filled out the Preference Record near the
beginning of the term were retested after an interval of a
few weeks. The elapsed time was approximately one
month for the high-school pupils and two months for the
college students. The correlations between the scores
resulting from the two administrations are shown in Table
1. Means and standard deviations of the distributions are
also given.
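
The retest coefficients reported in this study are product-moment correlations, each printed with a probable error (P.E.). The article does not state how the probable errors were computed; the sketch below assumes the conventional formula of the period, P.E. = 0.6745(1 - r²)/√N, which agrees with legible entries such as .814 ± .024 for N = 90 and .588 ± .061 for N = 52. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: raises ZeroDivisionError if either list has no variation.
    return sxy / math.sqrt(sxx * syy)


def probable_error(r, n):
    """Probable error of r under the ASSUMED formula 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


# Check against two coefficients reported for the retest data.
pe_computational_hs = round(probable_error(0.814, 90), 3)
pe_social_service_college = round(probable_error(0.588, 52), 3)
```

Under this assumption the probable error depends only on r and N, which also makes it a convenient check on digits that are hard to read in a printed table.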

For all scales, the correlations between the two
administrations of the Preference Record to the
secondary-school group are above .8. They vary from approximately
.81 to about .91. With the exception of the correlation
for the social service scale, the correlations between the
two administrations of the test to the college group are
above .74. They range upward to approximately .87. In
general, these coefficients are rather high for correlations
based on retesting after an interval of several weeks. In
fact, most of the correlations seem exceptionally
satisfactory for a measuring device that can be administered and
scored so quickly and that yields as many as seven scores.

⁴ G. F. Kuder and M. W. Richardson, "The Theory of the Estimation of
Test Reliability," Psychometrika, II (1937), 151-160.

TABLE 1

RETEST RELIABILITY OF THE KUDER PREFERENCE RECORD BASED
ON THE SCORES OF SECONDARY SCHOOL PUPILS AND OF COLLEGE
FRESHMEN IN SOUTH CAROLINA

(? = digit illegible in the source copy)

Secondary School Pupils

Scale             N    r ± P.E.       Mx     SDx     My     SDy
Scientific        90   .907 ± .013   41.80   9.33   41.83   9.25
Computational     90   .814 ± .024   19.82   7.24   19.38   6.75
Musical           90   .876 ± .017   16.69   7.39   16.24   7.15
Artistic          90   .857 ± .019   30.87   8.97   31.10   8.04
Literary          90   .863 ± .018   32.30   9.71   32.80  10.14
Social Service    90   .835 ± .021   39.97   9.66   41.20  10.15
Persuasive        90   .830 ± .021   45.27   9.25   46.27   8.95

College Freshmen

Scale             N    r ± P.E.       Mx     SDx     My     SDy
Scientific        52   .78? ± .036   42.08   9.68   43.65   9.19
Computational     52   .748 ± .041   18.15   7.08   18.73   6.70
Musical           52   .871 ± .023   18.92   7.1?   17.54   6.49
Artistic          52   .820 ± .031   30.92   8.86   29.85   ?.36
Literary          52   .789 ± .035   31.56  10.10   34.27  10.67
Social Service    52   .588 ± .061   41.37   9.42   43.54   8.02
Persuasive        52   .795 ± .034   45.62   9.30   46.65   9.16

The correlations based on the secondary-school group
are high enough to warrant considerable use of the
Preference Record in individual prediction and guidance.
Reliability coefficients above .90 are theoretically
desirable for a test that is to be used in this way, but experience
indicates that they are very seldom attained in a test that
yields several different scores.

The correlations for the college freshmen tend to be
lower than those for the secondary-school pupils,
although there is no significant difference in the case of
the musical scale. The correlation for the social service
scale, .588, is much the lowest in the group. The
secondary-school data indicate, however, that this scale is not
less reliable than some of the others.⁵

⁵ Since this correlation was out of line with the others, it was rechecked with
the greatest of care. Every paper was rescored, the data were redistributed,
and the entire calculation was carried through a second time. It seems certain,
therefore, that no error of a clerical nature is involved.

The reason for the somewhat higher correlations at
the secondary-school level than at the college-freshman
level is not entirely clear. It was thought at first that
possibly the combining of the three high-school grades into
one group had increased the variability and thus raised
the correlations. However, a comparison of the standard
deviations shows that in general the variability is not
greater for the high-school group.

The difference in magnitude of the correlations may
be due simply to the longer time interval between
administrations of the Preference Record to the college group.
By the same reasoning, the correlations for the
high-school pupils may be a little lower than they would have
been if only a few days had elapsed before the test was
repeated. It is improbable, however, that the basic
interests and motives of either high-school or college students
change significantly during a period of a few weeks.
Moreover, the repetition of the Preference Record after
a very brief period would have been subject to the
limitation that a memory factor might have produced spuriously
high correlations.

The retest correlations based on the secondary-school
group correspond rather well with the estimated
reliabilities reported by Kuder. The correlation obtained in this
study for the scientific scale is a little higher than Kuder's
figure. The two sets of reliabilities for the musical scale
and the social service scale would agree exactly if those
reported here were rounded to two decimal places. In
the case of the other four scales, the retest correlations for
the secondary-school pupils are lower than Kuder's
reliabilities, but the differences are not marked.

The college-freshman retest correlation for the
musical scale is in very close agreement with Kuder's
reliability coefficient. The college-freshman correlations for the
other scales are significantly lower than those found by
Kuder, but the only striking difference is between the
correlations for the social service scale.

Means and Standard Deviations

Although this is not concerned with one of the main
questions raised in this study, it may be noted in passing
that the means and the variabilities of the distributions
resulting from the two administrations of the Preference
Record tend to be closely similar in both groups.
Apparently the practice effect was negligible; that is, the scores
did not tend to be higher on the second administration as
a result of the subjects' having taken the test previously.
The absence of evidence of practice effect is a further
point in favor of the Preference Record.

Another interesting observation based on the means
and standard deviations shown in Table 1 is that, on the
whole, the differences between the two groups in central
tendency and variability are slight. This observation
suggests that interests in the seven areas involved are
relatively mature by the time pupils enter the secondary
school. The largest difference in favor of the
college-freshman group is found in the social service scale, a
result which familiarity with scores of high-school and
college students on the Strong Vocational Interest Blank
would lead one to expect.

Stability of the Scores

We have just noted that the retest correlations for the
scales of the Kuder Preference Record tend to be rather
high for an interval of a few weeks. But how high would
they be for a rather long period, let us say, a year or
more? In other words, what is the stability of the scores
and what is their value for long-time predictions? Some
information relative to these questions is provided by the
correlations in Table 2, which are based on the retesting
of 16 adults with the Preference Record after an interval
of approximately 15 months.

TABLE 2

CORRELATIONS BETWEEN SCORES MADE BY SIXTEEN ADULTS ON
TWO ADMINISTRATIONS OF THE KUDER PREFERENCE RECORD
AFTER AN INTERVAL OF APPROXIMATELY FIFTEEN MONTHS

(? = digit illegible in the source copy)

Scale             N    r ± P.E.       Mx     SDx     My     SDy
Scientific        16   .828 ± .053   47.25  12.14   47.50  10.94
Computational     16   .864 ± .043   24.56   ?.27   23.81   8.68
Musical           16   .933 ± .022   22.00   8.46   22.63   8.31
Artistic          16   .698 ± .086   31.38   6.90   31.25   8.35
Literary          16   .810 ± .058   44.35   8.48   44.25   8.13
Social Service    16   .611 ± .106   42.38   7.17   41.50   8.29
Persuasive        16   .883 ± .037   31.38  11.70   32.88  12.42

The correlations in Table 2 range from above .6 for
the social service scale to above .9 for the musical scale.
The correlations for all the scales except the artistic and
social service ones are above .8. The reliability of these
correlations is of course limited by the small number of
cases. For example, the correlation coefficient for the
artistic scale was lowered considerably by a marked
change in the score of one person.
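
The remark about a single person's score is easy to illustrate. The sketch below uses invented scores for 16 people (not the study's data) to show how far one marked change can pull down a correlation when the group is this small; the helper name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL scores for 16 people: 20, 22, ..., 50 on the first testing.
first = list(range(20, 52, 2))
# Second testing: everyone scores one point higher, so the ranks are unchanged.
second = [score + 1 for score in first]

# Same data, except that the lowest scorer now obtains a score of 50.
second_one_changed = second.copy()
second_one_changed[0] = 50

r_stable = pearson_r(first, second)
r_one_changed = pearson_r(first, second_one_changed)
```

With only 16 cases, `r_one_changed` falls well below `r_stable` although fifteen of the sixteen score pairs are identical in the two comparisons.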

Nevertheless, the correlations in Table 2 suggest that
in general the scores on the Preference Record are rather
stable for a period as long as 15 months and that they
provide a fairly satisfactory basis for long-time
predictions. Emphasis is given to this point when one examines
the preference profiles of the different individuals. In
nearly all cases, the high and the low points resulting
from the first administration of the test were closely
similar to those based on the second administration. The
profiles for two individuals, in terms of percentiles, are
shown in Figures 1 and 2.

[Figure 1. Profile of a Girl Secretary with a Long-Standing Interest in
Music. Percentile profile over the seven scales: 1 Scientific,
2 Computational, 3 Musical, 4 Artistic, 5 Literary, 6 Social Service,
7 Persuasive.]

[Figure 2. Profile of a Machine Scoring Supervisor, an Occupation for
which Interest and Ability in Computation Are Very Important. Percentile
profile over the same seven scales.]

Sex Differences

When one is interpreting the profile of an individual
on the Preference Record, it is of some interest to know
whether there are characteristic differences between the
average scores of boys and girls. The mean scores made
on the various scales of the Preference Record by groups
of high-school boys and girls and college-freshman boys
and girls in South Carolina are presented in Table 3.

TABLE 3

MEAN SCORES OF GROUPS OF BOYS AND GIRLS IN HIGH SCHOOL
AND IN COLLEGE ON THE KUDER PREFERENCE RECORD

(? = entry missing or digit illegible in the source copy)

                   High     High     Freshman     Freshman     Freshmen
                   School   School   Boys in a    Girls in a   in a
                   Boys     Girls    State        State        Girls'
Scale                                University   University   College
Number of Cases    152      135      203          173          534
Scientific         48.1     37.9     47.9         41.4         41.7
Computational      21.6     16.9     21.8         17.7         17.8
Musical            13.2     19.2     15.8         19.7         ?
Artistic           27.4     29.9     26.9         32.1         29.7
Literary           ?        33.4     32.8         36.7         3?.8
Social Service     37.8     44.8     40.7         45.4         46.7
Persuasive         47.9     44.9     49.4         42.8         ?

As one might expect, in both groups the boys are on
the average higher than the girls in scientific,
computational, and persuasive preferences, while the girls surpass
the boys in musical, artistic, literary, and social service
preferences. The largest differences are for the scientific
scale. However, even the smallest differences in medians
amount to nearly 10 percentile points. It appears,
therefore, that sex differences in preferences should be taken
into consideration when the scores on this test are
interpreted.


Grade-Level Differences

When the results of achievement tests are being
studied, the usual procedure is to interpret the scores in terms
of the norms for the grade the pupils are in. Is it
necessary to follow this procedure with the Preference Record,
or are the results in different grades so similar that one is
justified in disregarding grade level, at least as far as the
secondary school is concerned? Some information on this
question is given in Table 4, which shows mean scores of
boys and of girls in Grades IX, X, and XI of a South
Carolina High School.


TABLE 4

MEAN SCORES MADE ON THE PREFERENCE RECORD BY GROUPS OF
BOYS AND GIRLS IN GRADES IX, X, AND XI OF A SOUTH CAROLINA
HIGH SCHOOL

                          Boys                      Girls
                   Grade   Grade   Grade     Grade   Grade   Grade
Scale              IX      X       XI        IX      X       XI
Number of Cases    48      77      27        42      72      21
Scientific         48.8    48.4    45.8      36.1    39.0    38.0
Computational      20.5    22.0    22.2      16.9    16.9    16.7
Musical            11.3    13.6    15.2      18.9    17.4    19.8
Artistic           27.1    28.1    25.9      31.3    29.4    28.8
Literary           30.3    30.5    30.1      32.9    33.9    32.9
Social Service     40.0    37.0    36.3      44.0    44.6    47.1
Persuasive         47.8    47.5    49.2      42.3    46.0    46.4


Because of the rather small number of cases, the means
are not highly reliable indicators of the preferences at the
different grade levels. Nevertheless, the fluctuations in
mean scores made by the pupils in the three grades are not
great. Moreover, there is no consistent trend toward the
obtaining of higher or lower ratings with advancement
in grade level. The data in Table 4 are by no means
conclusive, but, as far as they may be interpreted, they suggest
that different norms for these three grades are not needed.
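
Because the grade means differ so little, pooling them loses little information, and the pooled figures can be compared with the high-school columns of Table 3. The sketch below computes a mean weighted by the number of cases. It reads the Table 4 entries with one decimal place (the scan prints them without decimal points), which is our assumption, and the function name is hypothetical.

```python
def weighted_mean(means, counts):
    """Pool group means, weighting each by its number of cases."""
    total = sum(counts)
    return sum(m * c for m, c in zip(means, counts)) / total


# Table 4 entries for Grades IX, X, XI (decimal placement ASSUMED).
boys_counts = [48, 77, 27]
boys_musical = [11.3, 13.6, 15.2]

girls_counts = [42, 72, 21]
girls_literary = [32.9, 33.9, 32.9]

pooled_boys_musical = round(weighted_mean(boys_musical, boys_counts), 1)
pooled_girls_literary = round(weighted_mean(girls_literary, girls_counts), 1)
```

The pooled values can be set beside the corresponding high-school means in Table 3 as a consistency check between the two tables.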

Mean Profiles for Different Fields of Study

The means and standard deviations of scores of
University of South Carolina freshmen classified according
to field of specialization are shown in Table 5. The
percentile ratings of the mean scores are indicated
graphically in Figure 3.

TABLE 5

MEANS AND STANDARD DEVIATIONS OF SCORES MADE ON KUDER
PREFERENCE RECORD BY VARIOUS GROUPS OF FRESHMEN IN THE
UNIVERSITY OF SOUTH CAROLINA, INCLUDING BOTH MEN AND
WOMEN



(A question mark stands for a figure that is illegible in this copy.)

                 Engineering (B.S.)   Journalism          Art                 Education (A.B.)
Scale            N   Mean   S.D.      N   Mean   S.D.     N   Mean   S.D.     N   Mean   S.D.

Scientific       79  ??.40  ?.20      26  39.23  10.10    15  41.93   7.19    36  39.50  10.0?
Computational    79  ??.29  5.32      26  11.92   5.48    15  17.?7  10.53    36  16.83   6.66
Musical          79  14.44  ?.45      26  18.77   6.11    15  18.07   5.31    36  18.72   7.65
Artistic         79  ?9.13  7.17      26  25.23   7.22    15  45.67   7.??    36  25.50   7.9?
Literary         79  ??.43  7.44      26  54.??   6.55    15  31.13   ?.5?    36  38.28   8.69
Social Service   79  ??.14  7.21      26  41.00   ?.38    15  44.37   5.08    36  44.73  11.46
Persuasive       79  46.90  ?.0?      26  ?8.15   9.33    15  42.73   8.79    36  44.23   9.43

(A question mark stands for a figure that is illegible in this copy.)

                 Commerce (B.S.)      Secretarial Science  Pre-Medicine        Pharmacy
Scale            N   Mean   S.D.      N   Mean   S.D.      N   Mean   S.D.     N   Mean   S.D.

Scientific       83  ?0.35  8.50      83  38.59   8.03     53  56.24   8.66    13  54.69   7.80
Computational    82  31.7?  6.47      83  20.86   7.??     53  17.04   5.31    13  16.54   5.21
Musical          82  15.19  6.47      83  19.12   5.89     53  16.81   7.22    13  18.54   8.16
Artistic         82  24.98  6.16      83  30.18   8.?1     53  26.77   7.?2    13  28.08   6.62
Literary         82  31.12  8.53      83  33.84   8.55     53  34.28   7.95    13  ?0.??   7.13
Social Service   82  40.85  7.79      83  43.00   ?.99     53  48.59   7.25    13  ?2.85   7.08
Persuasive       82  54.83  8.66      83  45.17   9.16     53  ?4.62  10.55    13  43.62   8.39


(A question mark stands for a figure that is illegible in this copy.)

                                      Arts and              Arts and
                 Pre-Law              Sciences (A.B.)       Sciences (B.S.)
Scale            N   Mean   S.D.      N    Mean   S.D.      N   Mean   S.D.

Scientific       32  40.??   ?.50     169  41.13   8.60     49  51.45   9.41
Computational    32  20.69   5.85     169  17.64   6.68     49  19.57   6.9?
Musical          32  16.69   6.22     169  19.??   7.32     49  16.18   7.17
Artistic         32  24.13   7.38     169  30.66   9.30     49  27.61   8.32
Literary         32  33.00   ?.65     170  35.71  10.21     49  34.43   8.59
Social Service   32  38.13   6.34     169  44.33  10.38     49  43.12  10.07
Persuasive       32  ?7.94  10.99     170  44.29  10.42     49  44.10  10.?5


The profiles of no two groups are alike. The greatest
similarity probably is in the profiles for the pre-medical
and pharmacy groups, but even here the correspondence
is not especially close when all the scores are considered.
Most of the high points and low points in the profiles
occur according to one’s expectation. The art group, for
example, is rather low in the scientific scale but very high
in the artistic scale. The journalism group is close to the
thirtieth percentile on the scientific and computational
scales, but almost up to the ninetieth percentile on the
literary scale. The commerce group is approximately at
the eightieth percentile in computational and persuasive
interests, but close to the thirty-fifth percentile in scientific
and literary interests. Both the pre-medical and the
pharmacy students are high in scientific interests. The
pre-medical students are also fairly high in social service
interests. The pre-law students are close to the thirtieth
percentile on the scientific and social service scales, but
near the eighty-fifth percentile on the persuasive scale
and not far below the seventieth percentile on the computational
scale. The engineering group is above the seventieth
percentile in computational and scientific interests,
but below the thirtieth percentile in literary interests.

Figure 3. Mean Profiles of Groups of Freshmen Who Have Indicated
Educational or Occupational Choices in the Fields Named

In the manual for the Preference Record, Kuder has
given median profiles for groups of students who have
chosen occupations in the fields of writing, social service,
physical sciences, political science, business and accounting,
veterinary medicine, medicine, and law. A comparison
of the profiles in Figure 3 with Kuder’s profiles reveals
noteworthy similarities between those for (1) journalism
and writing, (2) commerce and business and
accounting, (3) pre-medical course and medicine, (4)
pre-law and law, and (5) arts and sciences (B.S.) and
physical sciences. The fact that the profiles derived from
two independent sources for groups in the same general
areas are similar and are on the whole in agreement with
what one would reasonably expect is favorable to the
reliability and validity of the Preference Record.

Conclusions

1. The retest reliability of the scales of the Kuder
Preference Record is rather high. The correlations between
the scores resulting from two administrations of the
Preference Record to a group of high-school pupils with
a time interval of about one month were above .8 for all
seven scales. The correlations between the scores based on
two administrations of the Record to a group of college
freshmen with a time interval of two months were above
.7 for six of the seven scales.

2. The scores on the Preference Record do not seem
to be influenced by practice in taking the Record when
there is an interval of several weeks between administrations
of the Record. The mean scores resulting from the
second administration of the Record were not appreciably
or consistently higher than the scores obtained the first
time the Record was taken.

3. The scores on the Preference Record appear to
have considerable value for relatively long-time predictions
as far as adults are concerned. The correlations
between the scores of 16 adults after an interval of 15
months were fairly high, varying from about .6 to slightly
above .9.

4. There are noteworthy sex differences between the
mean scores of high-school and college boys and girls.
On the average, the boys exceed the girls in scientific,
computational, and persuasive preferences; the girls are
higher than the boys in musical, artistic, literary, and
social service preferences.

5. It appears that interests and motivation in the
seven areas involved are relatively mature by the time
pupils reach the secondary school. The differences between
the mean scores of pupils in Grades IX, X, and XI
were found to be slight, and there was no consistent trend
toward higher scores with increase in grade level. Similarly,
the differences between the mean scores of the
secondary-school and college freshman groups were small.

6. Mean profiles were found for 11 groups of university
freshmen classified according to field of study or
occupational choice. The profiles tended to have high
points and low points at the places where one would
expect them to be. A comparison with profiles found by
Kuder showed that those for groups in the same general
fields were similar in shape.

In general, the data in this article are favorable to
the Kuder Preference Record. By far the most important
question that remains to be answered has to do with the
validity of the Record. Do the scales really measure what
they purport to measure? While certain aspects of the
data reported in Kuder’s manual and in this article imply
considerable validity, there is at present little direct evidence
concerning the validity of the Preference Record.
One of the writers reported a small amount of data on the
validity of the Record in Buros’ 1940 Mental Measurements
Yearbook, but there was nothing conclusive in the
findings. Further study of the Preference Record could
well be directed toward this question.





THE RELIABILITY OF RATIO SCORES 


LEE J. CRONBACH
State College of Washington 


EDUCATIONAL measurements often give rise to
quotients or ratios obtained when one score is divided
by another. The intelligence quotient, achievement quotient,
and per cent accuracy scores are examples. For
the effective interpretation of such a measure, it is important
that an appropriate estimate of its reliability be
obtained. While a formula for the reliability of ratios
has been presented by Holzinger, this, like other approaches,
has limitations which apparently have not previously
been discussed. The present article is intended
to summarize the procedures which may be applied to
ratio scores and to indicate the conditions under which
each is appropriate.

Ratio scores appear to be particularly important in
dealing with certain new-type tests such as those now
being published by the Progressive Education Association.
In Test 1.41, Social Problems,¹ for example, the
student is presented with a description of a social problem,
asked to state which of several solutions he favors,
and then to check, from a long list, which reasons he
would advance to support his decision. Since the student
may check as many reasons as he wishes, there is in practice
a wide range of “Total Reasons” among students.
One important datum is the extent to which the student
checks reasons which are inconsistent with his conclusion
and really support one of the other solutions. This is
expressed in the score “Number Inconsistent.” In order
to compare one student with another, it is convenient to
eliminate the comprehensiveness factor, expressing his
performance in a “Per Cent Inconsistent” score by dividing
Number Inconsistent by Total Reasons.

¹Test 1.41, Social Problems. Chicago: Progressive Education Association.

Retest method. One of the most satisfactory estimates
of the reliability of such a score is to be obtained by the
retest method. This method has generally been used to
determine the reliability of the I.Q. It is not easy to
rule out possible practice effect, even when parallel forms
are used. For some tests it is difficult to prepare a parallel
form; even where two forms are available, it is often
desired to estimate the reliability without the trouble a
second testing requires. Once data from two forms are
available, it is a simple matter to compute the two ratios
for each student and correlate them. It will be shown
below, however, that a coefficient so obtained may not be
equally appropriate for all scores in a given population.

Kuder-Richardson method. Where retesting is impracticable,
it is customary with ordinary tests to use the
Spearman-Brown split-half procedure or the recently developed
Kuder-Richardson method. The Kuder-Richardson
method is based upon a summation of the variance
of the items composing the total score;² since a ratio score
cannot be conceived as composed of a sum of items, the
method is not applicable (except in that special case
where the denominator of the ratio is a constant for all
students). Whether a modification of this method can be
developed which is appropriate for ratios is not known.

Split-half method. The Spearman-Brown formula
is based on the assumption that the two halves into which
the test has been split may be added to form the whole.
In the case of ratio scores, this assumption does not hold.
Since the denominator of the ratio is, in general, different
in each half, $\frac{a_1}{b_1} + \frac{a_2}{b_2}$ (the subscripts
denoting the two halves) is not equal to $\frac{a}{b}$, or
$\frac{1}{2}\,\frac{a}{b}$. It would be possible to correlate
$\frac{a_1}{b}$ with $\frac{a_2}{b}$; this would,
by the Spearman-Brown formula, yield $r_{\frac{a}{b}\frac{a}{b}}$. Since
this disregards the possibility of error in measuring the
denominator, it does not estimate the reliability of the
ratio in the usual sense of $r_{\frac{a_1}{b_1}\frac{a_2}{b_2}}$. It is obvious that these
objections do not apply to the special case where the denominator
is the same for every student, as in the per cent
accuracy score on a test where every student attempts
every item. Here, the reliability of the ratio is the same
as the reliability of the numerator.

²G. F. Kuder and M. W. Richardson, “The Theory of the Estimation of
Test Reliability,” Psychometrika, II (1937), 151-60.

Computation by formula. Statistical formulas for
obtaining the mean, standard deviation, and correlations
involving ratios by indirect methods were developed by
early workers. These formulas, obtained by assuming
that the variation of the denominator is small compared
to its mean, are as follows:

If $i = \frac{a}{b}$, $j = \frac{c}{d}$, $v_a = \frac{\sigma_a}{M_a}$, and so on, then

$$M_i = \frac{M_a}{M_b}\left(1 - r_{ab}v_av_b + v_b^2\right) \qquad (1)^3$$

$$\sigma_i^2 = \frac{M_a^2}{M_b^2}\left(v_a^2 - 2r_{ab}v_av_b + v_b^2\right) \qquad (2)^3$$

$$r_{ij} = \frac{r_{ac}v_av_c - r_{ad}v_av_d - r_{bc}v_bv_c + r_{bd}v_bv_d}{\sqrt{v_a^2 + v_b^2 - 2r_{ab}v_av_b}\,\sqrt{v_c^2 + v_d^2 - 2r_{cd}v_cv_d}} \qquad (3)^4$$

If the reliability of a score is conceived as its correlation
with itself, one may substitute a for c and b for d in
formula (3), obtaining this formula for the reliability:

$$r_{\frac{a}{b}\frac{a}{b}} = \frac{r_{aa}v_a^2 - 2r_{ab}v_av_b + r_{bb}v_b^2}{v_a^2 - 2r_{ab}v_av_b + v_b^2} \qquad (4)^5$$

³G. U. Yule and M. G. Kendall, Introduction to the Theory of Statistics
(12th ed., revised; Philadelphia: Lippincott, 1937), pp. 299-300.

⁴Karl Pearson, “On a Form of Spurious Correlation Which May Arise
When Indices Are Used in the Measurement of Organs,” Proceedings of the Royal
Society, LX (1897), pp. 489 ff.

This formula may be employed wherever the necessary
data can be computed. Since $r_{aa}$, $r_{bb}$, and the other variables
can be obtained without a retest, data from a single
testing are sufficient. Either the split-half or Kuder-Richardson
method may be used to estimate these reliabilities.
It must be emphasized, however, that the
formula is applicable only when the basic assumption is
valid, namely, when the spread of the denominator variable
is small compared to its mean.
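
The computation called for by formula (4) can be sketched in a few lines of Python. This is only an illustration: the function name and the numerical values are hypothetical, and the result is meaningful only under the assumption stated above, that the spread of the denominator is small compared to its mean.

```python
def ratio_reliability(r_aa, r_bb, r_ab, v_a, v_b):
    """Approximate reliability of the ratio a/b by formula (4).

    r_aa, r_bb: reliabilities of the numerator and the denominator
    r_ab:       correlation between numerator and denominator
    v_a, v_b:   coefficients of variation (sigma divided by mean)
    """
    cross = 2.0 * r_ab * v_a * v_b
    denominator = v_a ** 2 - cross + v_b ** 2
    if denominator <= 0:
        raise ValueError("the ratio has no variance under these inputs")
    return (r_aa * v_a ** 2 - cross + r_bb * v_b ** 2) / denominator


# Hypothetical statistics obtained from a single testing.
print(round(ratio_reliability(0.90, 0.95, 0.60, 0.20, 0.10), 3))  # 0.827

# Constant denominator (v_b = 0): the ratio is as reliable as its numerator.
print(round(ratio_reliability(0.90, 0.95, 0.60, 0.20, 0.0), 3))   # 0.9
```

The second call reproduces the special case noted earlier: when the denominator is the same for every student, the reliability of the ratio equals that of the numerator.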

In actual school testing of a single grade, the variation
in mental age or chronological age is normally quite a
small fraction of the mean M.A. or C.A. The ratio $\frac{\sigma}{M}$
for M.A. and C.A. is likely to be sufficiently small that
higher powers can be neglected; as a result, the formula
can be applied to either the A.Q. or the I.Q. under this
condition. An empirical test of the formula by Morley
showed close correspondence between formula results and
results from a retest of 381 pupils, all in Grade VIA, on
several achievement quotients.⁶ It does not follow that
the formula can be applied to the achievement quotient
if several grades are included in one population.

When the Per Cent Inconsistent score is studied, one
finds that the assumption does not necessarily hold. The
average student checks less than 50 reasons, but the range
in Total Reasons often is from 10 reasons to 70 reasons.
The coefficient of variation of the denominator in such a
case is so high that error may follow when the formula is
applied. A further confusion lies in the fact that all
scores in a given population are not equally reliable. A
digression to demonstrate this point is necessary before
methods of attacking this problem can be presented.

⁵This formula was first developed by Holzinger. See K. J. Holzinger,
“Formulas for the Correlation between Ratios,” Journal of Educational
Psychology, XIV (1923), 344-47. It may also be derived directly by approximation
from expansions of infinite series.

⁶C. A. Morley, “The Reliability of the Achievement Quotient,” Journal of
Educational Psychology, XXI (1930), 155-56.

It is well known that the reliability of a test is a function
of the length of the test. The student who marks
three inconsistent reasons out of 10 reasons used, and the
student who marks 30 inconsistent reasons out of the 100
he uses, both receive Per Cent Inconsistent scores of 30.
The score of the latter student is an estimate of his inconsistency
based on 100 responses; the estimate of the
former student is based on only 10. If the former student
were to mark one additional reason, his Per Cent Inconsistent
score would increase to 36 per cent or decrease
to 27 per cent; if the second student were to mark one
more reason, his score would shift upward only to 30.7
per cent or downward to 29.7 per cent, depending, of
course, on whether the additional reason were consistent
or not. Similarly, a change of only one point in the
numerator produces a much greater change in the ratio
for the student whose denominator score is low. From a
logical point of view, then, we would expect the standard
error of measurement of a ratio score to increase as the
denominator decreases. The standard error of measurement
and the reliability coefficient are inversely related;
therefore, the reliability of a score increases as the size
of the denominator increases.
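
The arithmetic of this example is easy to verify. The short Python sketch below (the function name is hypothetical) recomputes the Per Cent Inconsistent score after one more reason is marked, for the student with 3 inconsistent reasons out of 10 and for the student with 30 out of 100.

```python
def percent_after_one_more(inconsistent, total, extra_is_inconsistent):
    """Per Cent Inconsistent score after the student marks one more reason."""
    new_inconsistent = inconsistent + (1 if extra_is_inconsistent else 0)
    return 100.0 * new_inconsistent / (total + 1)


for inconsistent, total in [(3, 10), (30, 100)]:
    up = percent_after_one_more(inconsistent, total, True)
    down = percent_after_one_more(inconsistent, total, False)
    print(total, round(up, 1), round(down, 1))
# 10 36.4 27.3
# 100 30.7 29.7
```

One added response moves the short record by six points upward or three downward, but the long record by less than one point in either direction.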

Possibly reference to the per cent accuracy concept
will further clarify this point. If we were to ask a student
a single question, he could answer it correctly or
incorrectly or could omit it. If we desired to know the
percentage of his attempts that were successful, we could
compute a per cent accuracy score, which, based on one
question, could only be 100, 0, or indeterminate. Certainly
no measure of this sort, based on one item, would be considered
significant. If two questions were asked, he could
have both right, both wrong, one right and one wrong, or
could omit either, or both. His per cent accuracy score,
under each of these conditions in order, would be 100, 0,
50, 100 or 0, or indeterminate. Obviously, when a score
of 50 per cent is possible, discrimination is finer than
when only scores of 100 and 0 are possible. Similarly, as
the number of items attempted increases, discrimination
becomes increasingly fine, which means that accuracy,
hence reliability, of measurement increases. No matter
how many items are added to the test, if a student omits
all the items no meaningful per cent accuracy score can
be obtained, and, in general, the accuracy with which his
performance is measured depends upon the number of
items he attempts. Trimble has suggested⁷ that this
applies not only to the ratio scores, but to scores on any
test where the student is instructed to respond to as many
items as he wishes; this pattern is found in several tests
of the Progressive Education Association series.
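
The growing fineness of discrimination can be made concrete by listing the per cent accuracy scores that are attainable for a given number of items attempted. The following Python sketch is illustrative only; the function name is hypothetical, and an empty list stands for the indeterminate case in which nothing is attempted.

```python
from fractions import Fraction


def attainable_scores(attempted):
    """Per cent accuracy scores possible when `attempted` items are tried."""
    if attempted == 0:
        return []  # no attempts: the score is indeterminate
    return sorted({Fraction(100 * right, attempted)
                   for right in range(attempted + 1)})


print([int(s) for s in attainable_scores(1)])   # [0, 100]
print([int(s) for s in attainable_scores(2)])   # [0, 50, 100]
print(len(attainable_scores(20)))               # 21
```

With n items attempted, the attainable scores are spaced 100/n points apart, so the student who attempts few items is placed on a much coarser scale than the student who attempts many.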

Since it has therefore been demonstrated that the reliability
of any score $\frac{a}{b}$ is a function of the size of b, as
well as of the test used and the group measured, one may
raise the question: how can a formula for reliability of a
ratio, giving a single answer, be meaningful? The answer
may be obtained by recalling the basic assumption under
which the formula was obtained, to wit, that it holds only
for those cases where $\sigma_b$ is small compared to b. This is of
course most likely where either (a) there is little variation
in b scores within the group or (b) values of b are high. In
the former case, all scores will have about the same reliability,
which can be estimated by formula (4). In the
latter case, the value obtained by the formula is a limiting
value. It was pointed out that the reliability of a ratio
increases as the denominator increases, other things being
equal. Since, as the denominator increases, the case is
approached where powers of $\frac{\sigma_b}{b}$ are completely negligible,
it follows that the value given by formula (4) is valid
only where the assumption is met, and that for lower
values of b one may expect a lower reliability.

⁷In correspondence with the writer.

Another approach to the same type of statistic gives
a formula for the standard error of measurement. If a
large number of measures of a, say $a_1, a_2, a_3, \ldots, a_n$, and
corresponding measures $b_1, b_2, b_3, \ldots, b_n$ are obtained
for the same person by a series of n measurements, a set of
values of $i$ will also be obtained. The standard deviation of
this set is by definition the standard error of measurement
of the ratio. From (2),

$$\sigma_i = \frac{M_a}{M_b}\sqrt{v_a^2 - 2r_{ab}v_av_b + v_b^2} \qquad (5)$$

If one assumes that errors in a are independent of errors
in b, when the same person is tested repeatedly,

$$\sigma_i = \frac{M_a}{M_b}\sqrt{v_a^2 + v_b^2} \qquad (6)$$

or, if s is used as a symbol for the standard error of measurement,

$$s_i = \frac{M_a}{M_b}\sqrt{\frac{s_a^2}{M_a^2} + \frac{s_b^2}{M_b^2}} \qquad (7)$$

where $s_a$ and $s_b$ can be obtained by the usual methods. This
reduces, using the identity $s = \sigma\sqrt{1 - r}$, to (4). It
must again be stressed that this formula is valid only
where $\frac{\sigma_b}{b}$ is small enough that powers may be disregarded.
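
A minimal Python sketch of this standard error is given below. It assumes the usual first-order form in which, for independent errors, the squared relative errors of numerator and denominator add; the function names and all numerical values are hypothetical. The closing lines check numerically that this form, together with the identity s = σ√(1 − r), agrees with formula (4).

```python
import math


def ratio_sem(mean_a, mean_b, s_a, s_b):
    """First-order standard error of measurement of a/b, independent errors."""
    return (mean_a / mean_b) * math.sqrt((s_a / mean_a) ** 2
                                         + (s_b / mean_b) ** 2)


def sem_from_reliability(sigma, reliability):
    """The identity s = sigma * sqrt(1 - r)."""
    return sigma * math.sqrt(1.0 - reliability)


# Hypothetical group statistics.
M_a, M_b = 20.0, 50.0
sigma_a, sigma_b = 4.0, 5.0
r_aa, r_bb, r_ab = 0.90, 0.95, 0.60
v_a, v_b = sigma_a / M_a, sigma_b / M_b

# Standard deviation of the ratio by formula (2), reliability by formula (4).
spread = v_a ** 2 - 2 * r_ab * v_a * v_b + v_b ** 2
sigma_i = (M_a / M_b) * math.sqrt(spread)
r_ii = (r_aa * v_a ** 2 - 2 * r_ab * v_a * v_b + r_bb * v_b ** 2) / spread

left = sem_from_reliability(sigma_i, r_ii)
right = ratio_sem(M_a, M_b,
                  sem_from_reliability(sigma_a, r_aa),
                  sem_from_reliability(sigma_b, r_bb))
print(round(left, 4), round(right, 4))  # 0.0268 0.0268
```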

Since the formulas (4) and (7) are valid for some
sets of scores and invalid for others, some procedure must
be developed to determine where the formula is applicable.
A useful test to determine whether the formula
applies to any set of scores is to obtain values for $M_i$ and
$\sigma_i$ empirically. If the values are close to those found by
formulas (1) and (2), the assumption may be considered
reasonable in this case; if discrepancy appears, the
formula should not be used.



With such a score as Per Cent Inconsistent on 1.41, the
assumption may hold for high values of b, but not for
scores based on small values of b (Total Reasons). In
this case it is possible to compute by formula the reliability
of those scores where the assumption applies, but
not for cases where the denominator is small. To determine
the range where the formula can be used, the
following procedure has been found efficient.

1. A scatter diagram of a against b is made for the sample.

2. An estimate is made that the assumption will hold for values of b
greater than a certain value, say $b'$. For all cases where b is equal
to or greater than $b'$, the standard deviation and mean of b, and $r_{ab}$,
are computed from the appropriate rows in the scatter diagram.

3. In a separate scatter diagram, i is plotted against b, using the same
class intervals for b as before. For all of the cases which fall in rows
so that $b \geq b'$, $\sigma_i$ and $M_i$ are computed by the usual method.

4. Using formulas (1) and (2), $\sigma_i$ and $M_i$ are computed from the data
obtained in step (2). If these values are equal, or virtually so, to
those obtained empirically for the same population in step (3), the
assumption that the variation of b is small compared to b itself is
probably justified for $b \geq b'$.

5. If the values from steps (3) and (4) are equal, it is possible that the
assumption holds for a value $b'' < b'$. If the values from steps (3)
and (4) are not equal, it is necessary to test a value $b'' > b'$. In
either case, a new hypothesis is made, that the assumption holds if
$b \geq b''$. Using the same scatter diagrams, values are calculated for
the means, standard deviations, and $r_{ab}$ for cases at or above $b''$. This
is a comparatively simple step, as most of the previous computation can
be used again. Again, the values of $M_i$ and $\sigma_i$ obtained by formula
are checked against those obtained empirically. By a repetition of
this process, it is possible to determine the smallest value $b^{(n)}$ of b for
which the statistics derived empirically and by the formula are equal
within prescribed limits of accuracy. It is probably unnecessary to
compare the means, as a check between the estimated and empirical
standard deviations should be an adequate test; since it requires little
additional work to check means also, it is probably wise to do so.
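
The steps above can be sketched in Python, working from the raw pairs of scores rather than from scatter diagrams. This is an illustration under stated assumptions, not the author's procedure in detail: the function names, the 5 per cent tolerance, the list of candidate cut-offs, and the sample data are all hypothetical, and the candidates are simply tried in ascending order instead of being adjusted up or down by judgment.

```python
import math
from statistics import fmean, pstdev


def formula_vs_empirical(a, b):
    """Return (M_i, sigma_i) by formulas (1) and (2), then empirically."""
    m_a, m_b = fmean(a), fmean(b)
    v_a, v_b = pstdev(a) / m_a, pstdev(b) / m_b
    # r_ab * v_a * v_b equals cov(a, b) / (M_a * M_b); using the covariance
    # avoids dividing by a standard deviation that may be zero.
    cov = fmean([(x - m_a) * (y - m_b) for x, y in zip(a, b)])
    c_ab = cov / (m_a * m_b)
    mean_formula = (m_a / m_b) * (1.0 - c_ab + v_b ** 2)
    spread = max(v_a ** 2 - 2.0 * c_ab + v_b ** 2, 0.0)
    sd_formula = (m_a / m_b) * math.sqrt(spread)
    ratios = [x / y for x, y in zip(a, b)]
    return (mean_formula, sd_formula), (fmean(ratios), pstdev(ratios))


def smallest_valid_threshold(a, b, candidates, tol=0.05):
    """Smallest candidate b' at which formula and empirical values agree."""
    for b_prime in sorted(candidates):
        kept = [(x, y) for x, y in zip(a, b) if y >= b_prime]
        if len(kept) < 2:
            continue
        (m_f, s_f), (m_e, s_e) = formula_vs_empirical(*zip(*kept))
        means_agree = math.isclose(m_f, m_e, rel_tol=tol, abs_tol=1e-12)
        sds_agree = math.isclose(s_f, s_e, rel_tol=tol, abs_tol=1e-12)
        if means_agree and sds_agree:
            return b_prime
    return None


# Hypothetical sample: four short records and four long ones.
a_scores = [3, 1, 6, 2, 30, 33, 27, 31]        # Number Inconsistent
b_scores = [10, 12, 11, 9, 100, 104, 98, 102]  # Total Reasons
print(smallest_valid_threshold(a_scores, b_scores, [9, 50]))  # 50
```

In this sample the formula badly understates the standard deviation of the ratio when the short records are included, and agrees with the empirical value once only records of 50 or more reasons are kept.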



Having identified the range of b for which the assumption
holds, one may compute the reliability or
standard error of measurement by formula (4) or (7).
Except for $r_{aa}$ and $r_{bb}$, the statistics which enter this
equation have already been computed in the steps above.
In many cases it is most simple to compute $r_{aa}$ and $r_{bb}$ by
the split-half method. If the Kuder-Richardson method
is used, it is necessary to make an item analysis of those
papers whose b-values are sufficiently high, separate from
such an item analysis of all papers as would ordinarily be
made for other purposes. It is possible to plan the item
analysis in advance, ranking papers in the order of their
b-scores, so that the Kuder-Richardson method may be
applied to a portion of the population economically.

Summary

Methods appropriate for computing the reliability of
ratio scores have been discussed. They are:

(1) The retest method, which requires construction
of a parallel form for greatest meaning. The necessity of
a second testing makes this inapplicable in many situations.
A coefficient so obtained assumes that the reliability
of all scores in a group is the same.

(2) The Spearman-Brown formula, applied to the
correlation between scores based on a splitting of the test
into two parts. This is generally invalid for ratio scores.

(3) The Kuder-Richardson formula, which may be
used only where the denominator of the ratio is a constant.

(4) The Holzinger formula for the reliability of
ratios, valid only if the variation of scores in the denominator
is small compared to the mean of the denominator
for the group. A related formula for the standard error
of measurement developed in this paper is valid under
the same conditions.

It was pointed out that ratio scores within the same
population, and even ratios which are equal in size, may
not have the same reliability. The standard error of
measurement increases as the denominator decreases. It
follows that a single reliability coefficient for a ratio score
is not meaningful, except for data where the variation in
the denominator is small compared to the denominator.




GUIDING STUDENTS TO BECOME 
SELF-GUIDING 


JOSEPH S. KOPAS

Fenn College 

TO adjust oneself to modern living requires a degree
of personal development that informal, hit-and-miss
efforts of the individual do not supply. Too many people
are frustrated and unhappy because their preparation for
adjusting themselves to our complex society has been only
incidental. In this day and age a person requires training
specifically directed at teaching him how to get the
most out of life and the most out of himself. He must
learn to appraise himself, to direct his personal development,
to take advantage of opportunities for growth, and
to evaluate his progress from time to time. The development
of these skills should not be left to chance. They
are far too important for that. An organized, formal
program of training in self-guidance is needed.

In recognition of this need, a guidance program has
been developed over a period of years at Fenn College.
The use of evaluative procedures is an integral part of
the program.

Fenn College utilizes the cooperative plan of education.
The students are divided into two groups: one in
class at the college, the other in full-time work off campus.
At the end of each three-month period they alternate.
The cooperative work experience received by the
students is an important factor in the guidance program.



Objectives of the Guidance Program 

At the time the guidance program was first considered
in 1931 as an organized activity of the college, it
seemed logical that the following objectives be kept in
mind:

1. That the guidance program be an integral part
of the college program.
The guidance program should perform so essential
a function that the contribution and progress
of the guidance activities could best be judged by
the progress made by the institution as a whole.

2. That the guidance program be centered on the
normal student.
In too many cases, because of lack of time and
personnel, only the maladjusted individuals in
college get any real assistance from the guidance
program and the normal student is, to a large
extent, disregarded. By stressing the preventive
as well as the adjustment phases of guidance
work, and by developing techniques and methods
which would be helpful to the normal students,
it was hoped that the primary function of the
guidance program would be to help the normal
individual.

3. That all faculty members participate.
It was thought desirable to have every faculty
member share in the program so that all students
could be assisted properly. It was assumed that
every faculty member could do some formal
guidance work and that in doing his part he
could, if given proper help, progressively qualify
himself to do more and to do it better. Furthermore,
it was assumed that the counseling experiences
could help him to become a more effective
teacher. Therefore, participation was expected
to be of personal value to the instructor
who shared in the program.



Organization of the Guidance Program

During the past ten years, a guidance program was
evolved which is in line with the above objectives. A list
of the features of the program, with a brief description,
follows:

1. The guidance program starts with the student.
It helps him face as much of the responsibility
for directing, motivating, and appraising the personal
development as is educationally desirable.
It expects and requires progressively greater assumptions
of that responsibility as the student
becomes more experienced and more capable.

2. Each instructor assumes a share of the responsibility
in the guidance program as a general
counselor.
Each instructor acts as a general counselor for
at least 10 students. The counselor is the institution’s
representative who assumes the responsibility
of seeing that everything within the power
of the college is done to help the student carry
out his program of development to a successful
conclusion. All formal guidance work, except in
abnormal or unusual cases, is carried out through
the counselor.

3. A guidance specialist is provided to serve as a
supervisor of the counselors.
His responsibility is to organize the program, to
select and develop techniques, and to provide
leadership and direction to the guidance program.

4. Each freshman student is enrolled in a group
guidance class, called the Orientation Class.
This feature will be described in detail later in
this article.

5. Faculty case board conferences are held at the
end of each quarter.
At these conferences the work of all students is
reviewed and the progress and difficulties discussed.
This activity is a very effective part of
the guidance program and provides very good
in-service training media in guidance techniques
and methods for counselors.

6. Various useful data are collected and made available
to both the student and the counselor.
These data include test results, records, and other
vital information.

7. A clinic is provided to deal with problem
students.
Specialists participate in this clinic. This feature
is still in its early stages of development.

The Record and Planning Folder
Space does not allow a detailed description of each
feature of the program. However, the Record and Planning
Folder and the Orientation Class, because of their
uniqueness and importance in the guidance program, will
be discussed here in greater detail.

The common practice is for colleges to keep records
of the student’s plans, achievements, and experiences. This
practice takes care of the administrative needs, but does
not give the student an opportunity to learn how to keep
his own records. Planning requires that reliable information
be gathered, organized, and used. For that reason
it is important that the student learn how to keep
records as a part of his training in self-guidance.

Three years ago a group of students in an orientation
class decided to do some pioneering work in the area of
personal record keeping. The form developed was called
the Record and Planning Folder. The students found
the record very helpful and were quite enthusiastic about
its value. The folder provided a convenient method of
gathering and organizing information necessary for effective
self-guidance.

Most of the information used in the folder was already
available but seldom organized by the students. The following
items were placed in the Record and Planning
Folder on specially prepared forms during the freshman
year:

1. Personal history

This section includes personal data, such as date
of birth, father’s and mother’s names, occupations,
nationality, names and ages of brothers and
sisters in the family, and the student’s employment
experience prior to entrance in college.

2. Autobiography

The autobiography is a report of approximately
1,500 words containing the highlights of the student’s
history.

3. Summary of high school record and experiences

This summary includes all the subjects taken by
the student and the grades, listed chronologically,
as well as the student’s rank in class, honors and
scholarship, special courses, extracurricular activities,
and his appraisal of his high school experiences.

4. Entrance test results

Each freshman undergoes a two-day testing program
as part of the freshman week activities. The
tests are of the type which the average instructor
and student can understand and use. For that
reason, the information is made available to both
the student and the counselor. Areas of testing
and the tests given are as follows:

General Ability Tests:         ACE Psychological Examination and
                               Otis Mental Ability Test

General Background Tests:      Cooperative General Achievement Tests in
                               Natural Science, Social Science,
                               Mathematics, and English

Training in Special Subjects:  Iowa Placement Examinations in
                               Mathematics and Chemistry Training

General Aptitudes:             Iowa Placement Examinations in
                               Mathematics, English, and Chemistry
                               Aptitudes

Vocational Interests:          Strong's Vocational Interest Test, using
                               the modified scoring

Special Skills:                Reading Test

Personal Characteristics:      A battery of tests developed by the author

5. Report on the tentative plans for the school year

This report covers the general plans for personal
development that have been worked out with
the student’s counselor. In addition to the academic
plans, they include plans for work experience,
extracurricular activities, social development,
and community and religious activities.

6. Scholastic record for the freshman year

Subjects taken and the grades received during
each quarter, point averages and rank in class,
and the student’s appraisal of the work done each
quarter are included.

7. Cooperative work experience

Freshmen normally start on their cooperative
work experience at the end of the third quarter.
Each student is required to write a report about
this work experience. The highlights of this
report, experience received, earnings, the employer’s
evaluation of the student’s work, and the student’s
appraisal of the work experience, are
placed in this section.

8. Record of unusual experiences and opportunities
utilized

This record includes extracurricular and community
activities in which the student engaged,
worthwhile social and religious experiences, as
well as any unusual activities.

9. Report on life philosophy

As a part of the Orientation Class activities
the student states his philosophy in terms of a
pattern of beliefs. He inserts this in the Record
and Planning Folder.

10. Appraisal and evaluation of progress and experience
during school year

In this section the student records the highlights
of the appraisal he and his counselor have made of
his progress and difficulties, and modifications
of plans made as the year progressed.

Provisions in the Record and Planning Folder are
made for the following information for each succeeding
year:

1. Addition to the autobiography

2. A report on tentative plans for each year (Made
prior to registration.)

3. Scholastic record

4. Cooperative work (work experience record)

5. Record of unusual experiences and opportunities
utilized

6. Appraisal and evaluation of progress during the
school year (Made at the end of each school
year.)

The Record and Planning Folder might very easily
become one of the most significant features of the guidance
program because it is helpful in so many different
ways to the student himself and to the faculty members
who deal with him. This coming school year every student
will maintain the Record and Planning Folder as a
part of his own guidance efforts.

The Orientation Class 

The purpose of the Orientation Class is to help students
intelligently plan and carry out a program of personal
development that will lead to successful adjustment
in major areas of adult responsibilities. It is an activity
which provides opportunity for formal training in the
process of self-guidance.




The students and instructors jointly and mutually
assume responsibility for planning, organizing, conducting,
and evaluating the Orientation Class activities. Consideration
of problems of personal adjustment and personal
development makes up the program. Three weeks
are spent in planning the program, seven weeks in carrying
it out, and one week in evaluating it.

The general theme of the course is “Learning to Live
in Our Modern Complex Society.” The areas of adult
activities which have been chosen to constitute the points
of emphasis of the course are as follows: Learning to live
with (1) one’s self, (2) others, (3) one’s job, (4) one’s
government, (5) one’s estate, (6) one’s culture, and (7)
one’s family.

Each student as a member of a committee helps plan
a program consisting of three one-hour sessions in one
of the above seven areas that he chooses, and then helps
conduct the class activities. At the end of the week, each
student turns in a report which includes his objectives,
his problems, and his plans for personal development in
the particular area under discussion. By the end of the
quarter each student has written in detail a report containing
seven sections stating how he intends to utilize
the opportunities for personal growth to be found in college,
in his cooperative work, and in community activities.

The advantages and importance of such a group guidance
activity are readily seen. In the first place, the
students are oriented into the major areas of adult life
activities and responsibilities; in the second place, they
are given a demonstration, through group thinking, of
how a student, by means of choice and planning of activities,
learns to assume more responsibility, exercises a
greater use of his intelligence, attains greater control of
his behavior, and is able to evaluate his experiences more
meaningfully than the student who merely drifts with
the current of events in college. Finally, in a friendly,
informal atmosphere, a stimulating environment is provided
for the student so that he gets a good start on his
self-guidance program.

Difficulties Encountered

Difficulties that are encountered in the guidance program
are common and familiar to all guidance workers.

One of these difficulties is that of making the term
“learning self-guidance” more concrete, both as to what
is to be learned and how it is to be practiced. We have
found that a practical approach to the difficulty is to limit
the formal training the students are to receive in the area
of self-guidance to the following three aspects:

1. The development of a dynamic outlook on life,
as a source of direction and motivation.

2. The acquisition of a basic knowledge of the planning
process, as a means of organizing and directing
one’s efforts.

3. The maintenance of a record, as a means of evaluating
one’s progress and as a means of interpreting
one’s efforts to others.

Each year that the student is in college he has an
opportunity to formulate and discuss with his counselor
the objectives of personal development he wishes to
achieve during the year and plans for achieving those
objectives, as well as any modifications of his plans or
objectives made during the year. At the end of the year,
he and his counselor evaluate the progress made. If the
student follows this procedure each year he is in college,
he will have practiced self-guidance in a very effective
and worthwhile way.

Another difficulty faced is that of getting the instructors,
who are busy and often not too interested or qualified
in guidance work, to put the necessary effort into the job.
We have tried to minimize the “too busy” problem by
(1) making the student more active in the program, (2)
making the information about students quickly and easily
available and usable, and (3) distributing the task equally
among all the instructors.




The “not interested” problem was tackled by (1) expecting
and giving the instructors an opportunity to participate
on the theory that participation stimulates interest;
and (2) getting them to see that the guidance program
contributes to the improvement of their main function,
which is teaching.

Finally, the “not qualified” problem was handled by
(1) providing in-service training for the instructors, and
(2) simplifying the techniques and devices to a point
where the average teacher can use them.

Another difficulty is that of overcoming student indifference
and the tendency to drift. Guidance implies motion.
Self-guidance, therefore, implies self-propelled motion.
It is absolutely essential in the guidance program
that the student take the initiative in developing himself.
The Orientation Class, the counseling system, and the
Record and Planning Folder help motivate the student
to take the initiative, rather than to sit back and
drift with the current of events.

It is not possible to make an evaluation at this time
of the complete effectiveness of the guidance program.
A survey of the results up to this date would show that
about 20 per cent of the students and 40 per cent of the
faculty members are doing a good job of their respective
parts of the program. Almost one-third of the faculty
and one-fourth of the student body are not functioning
very effectively. The remainder are doing just a fair
job. Admittedly, progress has been slow. But the participants
are becoming more and more interested as time
goes on, and the program appears to be growing in effectiveness.
Progress should be more rapid within the
next five years now that all the features described are
included in the program.





AN ATTEMPT TO MEASURE SCIENTIFIC THINKING

MAX D. ENGELHART AND HUGH B. LEWIS
Chicago City Junior Colleges

POSSIBLY the most challenging problem facing those
engaged in the construction and use of objective tests
is the creation of exercises which will require the functioning
of abilities transcending memory. The series of
exercises presented in this paper may not deserve the label
of a test of the ability to think scientifically. It seems
justified, however, to present the series in the hope that
the form of the exercises may suggest to other and more
ingenious test makers improved means of measuring abilities
which are among the universally recognized goals
of science instruction.

The series of exercises given here follows in its organization
the steps often regarded as the essence of the
scientific method. One would be naive to believe that
scientific problems are always solved in just these steps,
or that the processes involved in their solution may not be
more complex. On the other hand, the use of the stages
represented in these exercises may be appropriate in testing
students. Although the proper function of a test is
measurement, it may still be legitimate to recognize the
function of motivation, and exercises of the type classified
may also accomplish the purpose of engendering in the
minds of students a belief that knowledge of the scientific
method is important.

When the exercises were constructed, it was felt essential
to present certain introductory statements descriptive
of the scientific method and of the phenomenon with
which the problem to be solved is concerned. The distinction
between the directness and indirectness of the contribution
of a datum in determining the truth or falsity of
an hypothesis may be somewhat artificial. It is possible
that the use of three categories rather than five would be
justifiable. One might argue, however, that scientific
thinking does involve this evaluation of relevancy of data,
and that the more direct the contribution, the greater is
the dependence which may be placed upon the data.

When the content of the exercises was selected, an
effort was made to present a phenomenon which in most
respects would be novel to the students; that is, the phenomenon
would be novel, but the concepts basically involved
would relate to subject matter or principles with
which the students had had some experience. The exercises
on the operation of the radiometer were developed
from a series of exercises of a somewhat different type
which were written by Dr. C. E. Ronneberg of Herzl
Junior College for the January, 1939, physical science
comprehensive examination. It is possible to select other
phenomena for which problems can be stated, and to construct
similar exercises. There is, of course, no necessity
to restrict such exercises to the field of physics. The particular
phenomenon and exercises presented here were not
an altogether appropriate selection so far as the group
tested was concerned, since the level of difficulty was
too great.

The series of exercises was included in a test administered
to students entering the Chicago City Junior Colleges
who wished to enroll in the second, rather than in
the first, semester of the physical science survey. The
exercises and their introductory materials are reproduced
below.

A scientist, when confronted with a problem, formulates hypotheses
which represent tentative solutions to the problem. He then collects
data which may support or disprove his hypotheses. Finally, on the basis
of the data and the hypotheses thus tested, he derives a conclusion which
constitutes his answer to the problem.




The following exercises represent an effort to test your ability to do
scientific thinking. You are to test certain true or false hypotheses, and
to evaluate certain general conclusions. Assume that each item of data
below each hypothesis is a true statement and may directly or indirectly
help to prove an hypothesis true or false.

If the application of the item of data requires only one step to prove
the truth or falsity of an hypothesis, then the item is a direct help. For
example, the temperatures of water boiling on a given mountain and
at sea level would represent direct evidence of the falsity of the hypothesis
“water boils at a higher temperature on a mountain than at sea level.”

If the application of the item of data requires more than one step to
prove the truth or falsity of an hypothesis, then the item is an indirect
help. For example, the item “water in a container that can be evacuated
will boil at room temperature” indirectly helps to prove the falsity
of the hypothesis “water boils at a higher temperature on a mountain
than at sea level.”



[Figure: "Paddle Wheel from Above," with a label that appears to read "Photons of Light"]


A number of years ago Sir William Crookes perfected an instrument
which always intrigues people, whether laymen or scientists. This
is the radiometer, a device consisting essentially of a paddle wheel
which is free to rotate in a horizontal plane within a partially
evacuated glass bulb. One side of each paddle is brightly polished,
while the other side is coated with lampblack. As soon as the device
is placed in the sunlight, the little paddle wheel starts to spin rapidly.
It continues to spin until the device is again placed in the dark.


PROBLEM: How does sunlight cause the paddle wheel to rotate?


Below are given a series of hypotheses, each of which is followed by
numbered items which represent data. After each item number on the
answer sheet blacken space

A if the item directly helps to prove the hypothesis true.

B if the item indirectly helps to prove the hypothesis true.

C if the item directly helps to prove the hypothesis false.

D if the item indirectly helps to prove the hypothesis false.

E if the item neither directly nor indirectly helps to prove the
hypothesis true or false.



HYPOTHESIS I: In a partial vacuum the paddle wheel rotates because
of the impact of photons of light.

128. Scientists now believe that light has both corpuscular and wave
characteristics.

129. In a very high vacuum the bright faces of the paddle wheel turn
slowly away from the light, while the black faces turn toward the
light.

130. Light travels at the rate of 186,000 miles per second.

131. In a partial vacuum the black faces of the paddle wheel turn away
from the light, while the bright faces turn toward the light.

132. Light travels at a slower speed in glass than in air or in a vacuum.

133. After this item number on the answer sheet blacken space A if
Hypothesis I is true, or space B if it is false.

HYPOTHESIS II: A paddle wheel on which all of the faces are
bright or all are black will not rotate.

134. The black faces of paddles absorb energy from light to a greater
extent than the bright faces of paddles.

135. Rotation is due to force of impact. If all paddles are the same on
both sides, either all bright or all black, the turning forces would
cancel.

136. More photons rebound from bright faces than from dark faces.

137. In a partial vacuum, air molecules are constantly hitting the
paddles.

138. Photons are hitting the sides of the paddles which face the light.

139. After this item number on the answer sheet blacken space A if
Hypothesis II is true, or space B if it is false.

HYPOTHESIS III: Rotation in a partial vacuum of the paddle wheel
is due to the greater force of rebound of air molecules from the
black faces than from the bright ones.

140. The bright faces remain cooler than the dark faces, since they
reflect more light.

141. In a partial vacuum and in the dark the paddle wheel will rotate
when exposed to invisible infrared rays from a warm flatiron.

142. The black faces of the paddles become warmer than the bright
faces, since they absorb more light.

143. Air molecules adjacent to the warmer black faces rebound from
these faces with greater energy than from the cooler bright faces.

144. In a very high vacuum and in the dark the paddle wheel will rotate
slowly if invisible rays from a cathode tube are directed toward it.

145. After this item number on the answer sheet blacken space A if
Hypothesis III is true, or space B if it is false.

Below are five conclusions. After each corresponding number on the
answer sheet blacken space

A if in your judgment the conclusion is the best answer to the
problem.

B if in your judgment the conclusion is neither the best answer nor
the least satisfactory answer to the problem. (Three conclusions
should receive this mark.)

C if in your judgment the conclusion is the least satisfactory
answer to the problem.

146. The paddle wheel of the radiometer rotates, because air molecules
move with greater energy when heated by energy from sunlight or
from infrared rays from a flatiron.

147. Air molecules rebound with greater force from the bright faces,
which reflect more light energy. Photons rebound from dark
faces to a greater extent than from bright faces. The turning
forces thus created cause black faces to rotate toward the light
in a partial vacuum and away from the light in a very high
vacuum.

148. The paddle wheel of the radiometer rotates, because photons of
light strike air molecules with greater energy when adjacent to
the dark faces than when adjacent to the bright faces.

149. The fact that a radiometer will operate in either a partial or a
very high vacuum demonstrates that it is not essential that air
molecules be present in order to cause rotation.

150. Air molecules rebound with greater force from the black faces,
which absorb more light energy than the bright faces. Photons
rebound from bright faces to a greater extent than from dark faces.
The turning forces thus created cause black faces to rotate away
from the light in a partial vacuum and toward the light in a very
high vacuum.




In the table below are presented the proportions of
correct response and correlations of each of the items with
the total score on the test and with the total score on the
part, i.e., the total score on the 23 scientific thinking exercises.
These data are based on an analysis of the answer
sheets of a random sample of 200 cases. The selection


TABLE 1

ANALYSIS OF ITEMS

                                    Item-Test Correlation
                             Correlation with   Correlation with
Item                Item     Total Score on     Total Score
No.      Key     Difficulty  Entrance Test      on Part

128       E         .17           .25               .35
129       D         .09           .15               .20
130       E         .55           .58               .61
131       C         .18           .15               .27
132       E         .42           .37               .55
133       B         .21           .18               .30
134       E         .12           .19               .32
135       A         .60           .41               .41
136       E         .18           .15               .33
137       E         .28           .46               .55
138       B         .23           .28               .40
139       A         .65           .40               .38
140       B         .22           .00               .27
141       E         .18           .27               .27
142       B         .20           .08               .36
143       A         .41           .28               .33
144       E         .24           .41               .48
145       A         .33           .09               .29
146       B         .41           .38               .38
147       C         .15           .20               .20
148       B         .50           .35               .45
149       B         .39           .09               .19
150       A         .39           .19               .17


was made from the answer sheets of all of the students
taking the test on entrance into the junior colleges. The
reliability of the series was found to be .72 by means of
a Kuder-Richardson formula. The series of exercises
correlated .64 with the total score on the entrance test.
Seventy-five of the other exercises were factual, multiple-answer
items dealing with high-school physics and chemistry,
and 57 were true-false items pertaining to several
passages selected from advanced texts in the physical
science field. These latter exercises emphasized aptitude
more than training in that they were essentially a reading
test in the field of physical science.
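
The statistics reported for these exercises (item difficulty, item-test correlation, and a Kuder-Richardson reliability) can be illustrated with a short sketch. The article does not say which Kuder-Richardson formula or which correlation coefficient was used, so the code assumes formula 20 and an ordinary Pearson correlation between the item score and a total that includes the item. The function names and the five-examinee response matrix are invented for illustration and are not the authors' data.

```python
from math import sqrt
from statistics import mean, pvariance


def item_difficulty(responses):
    """Proportion of examinees answering each item correctly (1 = correct)."""
    n_items = len(responses[0])
    return [mean(row[j] for row in responses) for j in range(n_items)]


def kr20(responses):
    """Kuder-Richardson formula 20 for a matrix of 0/1 item scores (assumed formula)."""
    k = len(responses[0])
    p = item_difficulty(responses)
    total_variance = pvariance([sum(row) for row in responses])
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


def item_total_correlation(responses, j):
    """Pearson correlation between item j and the total score (item included)."""
    item = [row[j] for row in responses]
    total = [sum(row) for row in responses]
    mean_item, mean_total = mean(item), mean(total)
    covariance = mean((x - mean_item) * (t - mean_total)
                      for x, t in zip(item, total))
    return covariance / sqrt(pvariance(item) * pvariance(total))


# Hypothetical answer sheets: rows are examinees, columns are items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(item_difficulty(responses))
print(round(kr20(responses), 2))
print(round(item_total_correlation(responses, 0), 2))
```

For this invented matrix the item difficulties are .8, .6, .4, and .2, and the reliability works out to .80. Applied to the 200 sampled answer sheets, the same calculations would yield figures of the kind shown in the table.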






AN EVALUATION OF TECHNIQUES OF MEASURING
VISUAL ACUITY AT THE COLLEGE LEVEL

FRANCES ORALIND TRIGGS
University of Minnesota

KARL E. SANDT, M.D.
University of Minnesota Health Service

THE UNIVERSITY of Minnesota Health Service and
the University Testing Bureau have been cooperating on
an evaluation of the Betts Ophthalmic Telebinocular Test to
determine whether it is a valid screening test of visual acuity
for use with college students.

The problem of determining what students should and
what students should not be referred to an eye specialist is a
real one because many health services do not have such doctors
on their staffs and students, many of whom cannot afford it,
must pay for such service individually. If there is an eye specialist
on the staff of the health service, it is difficult for him
to give each student individual attention. The students who
come voluntarily to him to be examined may be the very ones
who do not need an examination, and those who do need it
may never get it, for often a student himself does not know
when his eyes need attention.

The plan of the research was this: The Betts Telebinocular
Examination was included as a part of the diagnostic
reading test battery which is given by the University Testing
Bureau in cases of suspected reading difficulties. At the time
the student took the Telebinocular Test, he was given a note
addressed to Dr. K. E. Sandt at the Health Service asking for
a complete eye examination. This note was an indication to Dr.
Sandt that the student was to be included in the research. It
was intended to have complete data from these tests on 100
students, but when the data from the records on the Telebinocular
and from the Health Service records were tabulated, it
was found that data were complete for only 87 students. The
measures on which data were available from both sources
were visual acuity in the right and left eye separately,
exophoria and esophoria near and far, and hyperphoria near
and far. If a student wore glasses he was supposed to be
tested with and without glasses. If the student wore glasses to
the examination at the Health Service and the examination at
the University Testing Bureau, the data with glasses were
always used. If data were available only from one examination
without glasses, the data from both services without glasses
were used.¹

In evaluating these data, certain facts should be kept in
mind. The Betts Telebinocular² purports to screen out students
having measurable visual defects serious enough to be considered
for correction by an ophthalmologist. In a letter to the
University Testing Bureau from the Bureau of Research of
the Keystone View Company dated April 22, 1940, the following
bases for referral were given: "(a) You need have no hesitation
about referring a patient who fails on any part of Test
3, provided that if there has been a failure on B or C with
both eyes open you make the test by occluding the eye not
being tested. If there is still a failure the patient should be
referred. (b) Test 4 is seldom failed but if it is failed, without
question the patient needs attention. (c) The failure
of Test 5 alone is not a warrant for referring the patient,
but if there have been other failures, particularly in Test 2
and 6A, there is no question but what there is poor eye co-ordination.
(d) With high school and university students
who complain of discomfort in reading, failures of Test 6B
and 7 taken together are indicative of near point trouble,
and they should have attention." It is upon these bases that
referral was determined by the University Testing Bureau for
this study. The ophthalmologist is trained to determine visual
defects and decide whether they are serious enough for correction;
thus it may be seen that there is some overlap of service
but no overlap of responsibility, the final decision always lying
with the ophthalmologist as to whether correction shall be
given. In the light of these stated purposes of the Betts test,
it would seem that the following questions might be helpfully
answered:

¹ It should be remembered when interpreting these data that there is no
indication as to whether or not this group of students is a selected sample of the
whole student body as far as visual acuity is concerned. There is no reason to
believe that they would be a selected sample on the criteria of visual acuity
just because they are on reading skills, for it has been shown that there is no
consistent relationship between these two factors.

² For complete description of the instrument, see Emmett Albert Betts, The
Prevention and Correction of Reading Difficulties (Evanston, Illinois: Row,
Peterson and Company, 1936), pp. 327-5Q.

1. Will the Betts test screen out for referral to the oculist
a large number of students in whom the oculist will find
deficiencies serious enough for correction?

2. Will the Betts Telebinocular refer a large proportion
of students whom the oculist finds to have no measurable eye
difficulty?

3. By comparing the oculist's and Betts’ records on
individual tests, are all of the Betts measures equally satisfactory?

4. On what tests of the Betts Telebinocular are referrals
made most often? Can a better basis of referral on the Betts
Telebinocular be found than the one furnished us by the Keystone
View Company?

The question which always arises in research of this kind
is whether the tests measure what they purport to measure and
whether, if they were administered a second time, they would
give the same results as they did the first time. These questions
have never been finally answered for either of the tests
used in this study. It may be that they never can be answered,
for in both cases the results are dependent upon the physiological
status of the individual being tested. The relationship
of the effects of fatigue, light, and other factors upon different
individuals may vary so greatly that a constant score may not
be possible. Or it may even be that there are still to be discovered
better ways of diagnosing visual anomalies.

In this study, the extent to which the two tests agree on
diagnosis is indicated, but neither test is assumed to give a
perfect diagnosis. However, because prescriptions are finally
made by the oculist, the extent to which the Betts would
refer to the oculist those people found by him to need correction
is pointed out.

The following evaluation of data is presented in answer
to these four questions.

1. Of the 87 students included in this study, 13 were given
glasses by the oculist; 11 were given prescriptions to correct
measurable physical eye defects, and two students were given
glasses merely to improve comfort while reading rather than
to correct measurable eye defects.

Of the 11 students given prescriptions to correct measurable
physical eye defects, all would have been referred on
the criteria of referral sent us by the Keystone View Company.
The remaining two students would not have been referred on
the basis of these criteria. Thus it will be seen that all students
found by the oculist to have measurable physical eye defects
would have been referred for complete examinations as a
result of the Telebinocular Test.

2. Of the 87 cases which had both the Betts test and an
ophthalmic examination, 46 would have been referred by the
Betts test to the oculist on the basis of the criteria furnished
us by the Keystone View Company. Of the 46 students who
would have been referred by the Telebinocular Test, only 11,
or 24 per cent, had defects serious enough to be corrected
by glasses. However, it should be remembered that while
only about 53 per cent of the group would have been referred
to the oculist for complete testing, 100 per cent would have
had to be tested, had no pre-test been given. While it might
be desirable to have a more rigorous screening test, it is certainly
worth-while to save the oculist from having to examine
almost half of the students.
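
The percentages in the two preceding answers follow directly from four counts reported in the text: 87 students tested, 46 referred by the Telebinocular criteria, 11 found by the oculist to need correction for measurable defects, and all 11 of those among the referred. A minimal sketch of the arithmetic follows; the function and key names are illustrative labels, not terms used by the authors.

```python
def screening_summary(n_tested, n_referred, n_needing_correction,
                      n_needing_correction_referred):
    """Summarize a screening test against the specialist's findings."""
    return {
        # Share of the whole group the screening test would send on.
        "referral_rate": n_referred / n_tested,
        # Share of referred students who actually needed correction.
        "yield_among_referred": n_needing_correction_referred / n_referred,
        # Share of students needing correction whom the screen referred.
        "defects_referred": n_needing_correction_referred / n_needing_correction,
        # Share of the group the specialist would not have to examine.
        "examinations_saved": (n_tested - n_referred) / n_tested,
    }


# Counts taken from the text above.
summary = screening_summary(87, 46, 11, 11)
for name, value in summary.items():
    print(f"{name}: {value:.0%}")
```

With these counts the referral rate is 53 per cent, the yield among referred students is 24 per cent, every student needing correction is referred, and 47 per cent of the examinations are saved, which corresponds to the "almost half" cited above.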




3. The data which bear on this question are presented
in Table 1. Data were complete from both sources for 87
students on visual acuity for the right and left eyes. For the
right eye, we find that 20 students, or 23 per cent, failed the
Betts test and the oculist’s test; 48, or 55 per cent, passed
both tests. In other words, it was found that the oculist’s
diagnosis and the Betts’ diagnosis agreed on 78 per cent of
the cases. Six students, or 7 per cent, failed the Betts test but
were found satisfactory by the oculist, and 13, or 15 per cent,
passed the Betts test but failed the oculist’s test. It should be
remembered that of these 13 who passed the Betts test but
failed the oculist’s test, the defect found by the oculist was
in no case considered serious enough for correction.


TABLE 1

COMPARISON OF RESULTS: BETTS TESTS AND OCULIST'S TESTS FOR EIGHTY-SEVEN
SUBJECTS

                                        Right Eye        Left Eye
                                        No.     %        No.     %

Failed Betts and Oculist's              20      23       13      15
Passed Betts and Passed Oculist's       48      55       54      62
Failed Betts and Passed Oculist's        6       7       10      11.5
Passed Betts and Failed Oculist's       13      15       10      11.5


For the left eye, data were again complete for 87 cases.
We find that 13 students, or 15 per cent, failed the Betts
test and the oculist’s test; 54, or 62 per cent, passed both
tests. Thus the oculist’s and the Betts’ diagnosis agreed on
77 per cent of the cases. Ten students, or 12 per cent, failed
the Betts test but were found satisfactory by the oculist, and
10, or 11 per cent, passed the Betts test but failed the oculist’s
test. It should be remembered that of those 10 who passed
the Betts but failed the oculist’s tests, the defect found by
the oculist was in no case considered serious enough for
correction.
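The agreement percentages quoted for the right and left eyes follow directly from the four cell counts of each comparison. The short Python sketch below recomputes them as an arithmetic check; the function name and the dictionary keys are illustrative only and are not part of the original study.

```python
def agreement_summary(failed_both, passed_both, failed_betts_only, failed_oculist_only):
    """Summarize a 2x2 pass/fail comparison between the Betts test and the oculist."""
    total = failed_both + passed_both + failed_betts_only + failed_oculist_only
    agree = failed_both + passed_both  # cases where both diagnoses coincide
    return {
        "total": total,
        "agree": agree,
        "agreement_pct": 100.0 * agree / total,
    }

# Cell counts as reported in the text for the 87 students.
right_eye = agreement_summary(20, 48, 6, 13)
left_eye = agreement_summary(13, 54, 10, 10)

for name, s in (("right eye", right_eye), ("left eye", left_eye)):
    print(f"{name}: {s['agree']} of {s['total']} agree "
          f"({s['agreement_pct']:.0f} per cent)")
```

Read this way, agreement means 68 of 87 cases for the right eye and 67 of 87 for the left eye.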

On the measure of vertical imbalance (hyperphoria) far
point on the Betts test, none of the 72 cases on which data
are complete for both measures failed the test. (Two students
who were included in the study, but for whom no oculist
measure of vertical imbalance far point was available, failed
this test.) The oculist found that 10 students, or 14 per cent,
of the 72 students for whom data were complete on both
measures had vertical imbalance and 62, or 86 per cent, did not.

The Betts test for vertical imbalance near point did not
identify any member of the group with this difficulty, but the
oculist found that seven, or 10 per cent, of the 72 did have
vertical imbalance and 65, or 90 per cent, did not have vertical
imbalance near point. These data would seem to indicate that
this test is of questionable value for use with college students.

On the lateral imbalance near point, data are complete
for 72 cases. Of the 72 cases, four students, or six per cent,
failed the Betts and the oculist’s tests; two, or three per cent,
failed the Betts and passed the oculist’s test; 18, or 25 per
cent, passed the Betts and failed the oculist’s test; and 48, or
66 per cent, passed the oculist’s test and the Betts test. Thus
it will be seen that the two measures agreed in 72 per cent
of the cases.

For lateral imbalance far point, no students of the 72 who
were found to be unsatisfactory by the oculist failed the Betts,
but two students, or three per cent, failed the Betts who were
found to be satisfactory by the oculist; four, or five per cent,
passed the Betts test who were found to be unsatisfactory
by the oculist; and 66, or 92 per cent, were found to be
satisfactory on both measures. For lateral imbalance far point,
the two measures agreed in 92 per cent of the cases.

This evaluation of tests would lead us to say that, of the
parts of the Betts tests studied, the one found to be most
valuable for referral of college students for a complete eye
examination is the one of visual acuity for the right and left
eye. There is not complete agreement between this test and
the oculist’s measurements, but it does agree in 78 per cent
of the cases for the right eye and 77 per cent of the cases for
the left eye, and where it does not agree the oculist found no
situation to exist which was serious enough for correction.
This finding raises the question as to whether another measure
of visual acuity not requiring expensive apparatus would serve
as satisfactorily. Only one other measure of visual acuity was
given these students. When students enter the University of
Minnesota they are required to have a physical examination at
the Health Service. As a part of that examination the eyes are
checked by use of the Snellen Chart.³ The record of these
examinations was on file at the Health Service and has been
tabulated for consideration here.

There was a record of a Snellen examination for all the
87 students included in this study. On the basis of that
examination eight students would have been referred to the oculist
for complete examination. For this study it is important to
determine whether these are the same students referred by the
Betts test and also to ascertain in how many cases the students
referred were found by the oculist to have a defect serious
enough to warrant a prescription.

Of the eight students referred by the Snellen test, six
would also have been referred by the Betts. As has been
stated, 13 of the 87 students were found by the oculist to
have defects serious enough to be corrected by glasses. Of the
13 given glasses, the Snellen Chart examination would have
referred only two to the oculist. These data would seem to
indicate that on measures of visual acuity the Betts test is
superior to the Snellen Chart in referring students with actual
difficulty to the oculist for complete examination.
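The comparison of the two screening devices can be put side by side with a little arithmetic. The sketch below uses only counts quoted in the text (87 students, 13 given glasses, 46 Betts referrals of whom 11 needed glasses, 8 Snellen referrals of whom 2 needed glasses). The function and key names are hypothetical labels chosen for this illustration, not terms used by the authors.

```python
def screening_summary(n_group, n_referred, n_referred_needing_glasses, n_needing_glasses):
    """Describe a screening test by how many it refers and how many true cases it catches."""
    return {
        # share of the whole group sent on to the oculist
        "referral_rate_pct": 100.0 * n_referred / n_group,
        # share of referred students who actually needed glasses
        "yield_pct": 100.0 * n_referred_needing_glasses / n_referred,
        # share of students needing glasses whom the screen would have referred
        "cases_caught_pct": 100.0 * n_referred_needing_glasses / n_needing_glasses,
    }

betts = screening_summary(87, 46, 11, 13)
snellen = screening_summary(87, 8, 2, 13)

for name, s in (("Betts", betts), ("Snellen", snellen)):
    print(f"{name}: refers {s['referral_rate_pct']:.0f}%, "
          f"yield {s['yield_pct']:.0f}%, "
          f"catches {s['cases_caught_pct']:.0f}% of those given glasses")
```

On these figures the two tests have a similar yield among the students they refer, but the Betts test would have referred many more of the students who were eventually given glasses.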

4. It will be remembered that the Keystone View Company
gave four standards for referring a student to the oculist
on the basis of the Betts test. These were:

(a) Failure on test 3 or any part of test 3 (visual acuity)

(b) Failure on test 4 (vertical imbalance)

(c) Failure on test 5 (coordination) with failure on test
2 (distance fusion) and failure on test 6A (lateral
imbalance far point)

(d) A complaint of discomfort in reading with test 6B
(lateral imbalance near point) and test 7 (fusion at
reading distance)

³ Ibid., pp. 149-51.

Of the 46 students who would have been referred to the
oculist on these standards of referral for the Betts test, 43 are
identifiable by their failure on the first criterion, i.e., failure on
one or more of the visual acuity tests. Two of these students
also failed the vertical imbalance far point test. Two of the
students who would have been referred failed on both criteria
one and four. Two failed criterion four only and would
have been referred on it alone. One student failed and would
have been referred on criterion three alone.

Thus, examination of our data would indicate that referral
on the basis of criteria one, three, and four and the record of a
complaint of discomfiture would have referred all students
given prescriptions by the oculist to correct measurable
physical eye defects. The number referred would have been 46
and it would have included the same students referred by the
four criteria given by the Keystone View Company. It is
questionable whether the test of vertical imbalance adds
anything essential to this series when used as a screening device
at the college level.

On the basis of the data just presented it does seem that
the Betts Ophthalmic Telebinocular Test can be used
satisfactorily by colleges as a screening device for referral of cases
to the oculist. The Betts test also stands up more satisfactorily
than does the Snellen Chart examination (when given
under the conditions described) as a measure of visual acuity.
These conclusions should be checked by repeating this study
on another group of students, and they should be accepted
only tentatively for situations other than those described in
this study.



THE CONCEPT OF SCATTER IN THE LIGHT OF
MENTAL TEST THEORY¹

MAURICE LORR
U. S. Civil Service Commission

RALPH K. MEISTER
Mooseheart Laboratory for Child Research

¹ The authors wish to thank Dr. M. W. Richardson for his review of this
article and Dr. M. L. Reymert for his interest and encouragement throughout.

THE CONFUSION and loose thinking among clinical
psychologists concerning the basis and significance of
scatter on scales of the Binet type suggest a re-examination
of the concept of scatter in the light of the theory of
psychological measurement.

Theoretically, on mental age scales of the Binet type, items
are arranged in the order of their difficulty, the easiest item
first. In clinical practice these items are administered to a
child in the same sequence. Groups of items, supposedly equal
in difficulty, are allocated to each year level as representing
the typical performance of individuals of the corresponding
chronological age. The child is given increasingly difficult
items until he reaches a point in the scale above which he fails.
Actually, no such point exists. Instead, the child passes all
tests at a certain level and continues with mixed successes and
failures on to the next higher level until he fails all items
presented to him in a given level. Such a spread of successes and
failures over a number of mental year levels is called scatter.

Test theory indicates five possible bases for such
irregularity of performance. First, scatter is a consequence of the
lack of perfect correlation between test items resulting from
the presence of error and from the low communality and
high specificity of the items. The error, as Mosier (10) has
shown, increases in the individual case with greater
heterogeneity of items. Thus, an individual who passes one item at
a given level may not necessarily pass a second item, either
because the two items do not measure the same function or
because of error involved in testing. Illustrative of this lack
of perfect correlation between test items is the Cattell and
Bristol study (4) in which a mean intercorrelation of +.32
was found for seven Binet test items; Wright (15) found the
mean intercorrelation on 31 items to be +.61.
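The spread of successes and failures across year levels, which the article calls scatter, can be made concrete with a toy record. The Python sketch below is only an illustration: the test record is invented, and counting the levels between the last all-pass level and the first all-fail level is one hypothetical way of describing the spread, not a measure endorsed by the authors.

```python
def scatter_span(record):
    """Locate the basal level, the ceiling level, and the mixed levels between them.

    record maps a year level to a list of booleans (True = item passed).
    """
    levels = sorted(record)

    # Basal: highest level reached with every item passed at it and below it.
    basal = None
    for level in levels:
        if all(record[level]):
            basal = level
        else:
            break

    # Ceiling: first level above the basal at which every item is failed.
    ceiling = None
    for level in levels:
        if (basal is None or level > basal) and not any(record[level]):
            ceiling = level
            break

    mixed = [
        level for level in levels
        if (basal is None or level > basal) and (ceiling is None or level < ceiling)
    ]
    return basal, ceiling, mixed

# Invented record: six items per year level.
record = {
    8: [True, True, True, True, True, True],
    9: [True, True, False, True, True, True],
    10: [True, False, False, True, False, False],
    11: [False, False, True, False, False, False],
    12: [False, False, False, False, False, False],
}

basal, ceiling, mixed = scatter_span(record)
print(f"basal level {basal}, ceiling level {ceiling}, mixed levels {mixed}")
```

In this invented case the child passes everything at year 8, fails everything at year 12, and shows mixed results over three intervening levels.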

Secondly, there is the fact that the items are incorrectly
allocated in the order of difficulty. This might be expected
in view of the fact that Terman and Merrill (12), for example,
although using curves of proportions-passing for each
item in the preliminary grouping of items, had as their goal
in the final grouping an I.Q. distribution with a mean of
approximately 100. It is likely that such a procedure resulted
in a grouping of items only roughly ordered as to
difficulty. Therefore, although the grouping of test items is
approximately in the order of their difficulty in the sense that
an item at age four is definitely less difficult than one at age
10, nevertheless, in adjacent groups there are probably a great
many inversions in difficulty. Thurstone’s study (14) on the
absolute scaling of Binet items shows that items at any particular
age level vary considerably in difficulty, a finding contrary
to the assumption of relatively equal difficulty among items
at any one age level. He found, too, inversions in placement
according to difficulty. Thus an item which is easier than
another may be placed at a higher age level. In fact, allocation
at the different age levels on the basis of difficulty is
improbable since the items distribute very unevenly in absolute
difficulty over the age levels. On the basis of Burt’s
data, Thurstone (14) says, “The test questions are more
numerous at certain ages than at others. For example, there
are 12 questions that scale at par between the ages five and
six, whereas there are only four questions that scale at par
between six and seven.” Any kind of arrangement that requires
the same number of tests at each age level is unlikely to result
in equal gradations of difficulty since the test items used
do not scale into any such grouping.

This fact of the incorrect allocation of test items at the
various age levels is brought out by the findings of many
other investigators. Cyril Burt (2) admits that no two
editors agree about the correct order of mental age items
and cites instances. Barber (1), on data for the revised
Form L, found that five items were significantly easier and
six items were significantly more difficult than their respective
age placements would indicate. Likewise, Harriman (5)
found (for the Revised Scale) that test items at year level
XII seem to be more difficult than those at year level XIII,
a fact which is confirmed by Carlton (3). Krugman (8)
found that for New York school children, 25 of the items
were incorrectly allocated.

Thirdly, scatter may be due to the lack of discriminatory
power of certain items. A highly diagnostic item will
discriminate simply between individuals with ability above that
required to respond correctly to the item and those individuals
who lack such ability. For example, two items may be equal
in difficulty (50 per cent pass at, say, age 11), but differ
widely in diagnostic value. The psychometric curve for one
item may extend over, say, years five to 14. The curve for
another item that is more discriminatory will extend over a
much smaller age range, such as eight to 12. This spread
or scatter is manifestly a result of low diagnostic value.
Thurstone (14) plotted curves of proportions-passing for a
random selection of items from the Burt-Binet and found a
“noticeable variation in the slopes of curves.” It is probable,
therefore, that some of the scatter found is due to these
differences in the diagnostic value of the items which these
examples illustrate.
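The contrast between two items of equal difficulty but unequal diagnostic value can be sketched numerically. The Python example below assumes, purely for illustration, that each item's curve of proportions-passing is a normal ogive centered at age 11; the two spread values are hypothetical and were not taken from Thurstone's data.

```python
import math

def proportion_passing(age, difficulty_age, spread):
    """Normal-ogive curve: proportion of children of a given age passing an item.

    difficulty_age is the age at which 50 per cent pass; a smaller spread
    gives a steeper, more discriminating curve. Both are assumed parameters.
    """
    z = (age - difficulty_age) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

DIFFICULTY_AGE = 11
FLAT_SPREAD = 2.5    # hypothetical item of low diagnostic value
STEEP_SPREAD = 1.0   # hypothetical item of high diagnostic value

print("age   flat item   steep item")
for age in range(5, 15):
    flat = proportion_passing(age, DIFFICULTY_AGE, FLAT_SPREAD)
    steep = proportion_passing(age, DIFFICULTY_AGE, STEEP_SPREAD)
    print(f"{age:>3}   {flat:9.2f}   {steep:10.2f}")
```

Both items are passed by half the children at age 11, yet the flatter item is still passed by an appreciable share of much younger children and failed by an appreciable share of older ones, which is the spread the paragraph attributes to low diagnostic value.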

A fourth source of scatter may be found in the fact that
there is an increase in variability with an increase in absolute
mean test performance (13). In other words, individuals
apparently become more variable as they grow older. Since
individuals vary more among themselves, they must vary as to
the number and types of items failed or passed. This can be
easily seen, since the extent to which Thurstone’s “primary
mental abilities,” for example, are present varies between
individuals and within individuals, and the factorial composition
of test items differs from item to item within the same
age level. Thus an individual who passes one item at a certain
age level may not pass another because the latter requires an
ability which he does not have to the required degree. Again,
an individual’s failure on five items at a certain level is no
sure indication that he will fail the sixth, since the sixth item
may require an ability which he has to a marked degree. This
tendency of scatter to increase with mean test performance or
chronological age is checked in actual practice between the
ages 10 and 12 (Reymert and Meister [11]), for within that
age range the ceiling of the test begins to limit the amount
of scatter possible.

A fifth possible cause of scatter is the presence of
systematic errors in testing due to language handicaps, sensory
defects, special training, lack of cooperation, and ambiguous
scoring or instructions. Unlike chance errors, which influence
the results as often in one direction as in another and
therefore can be assumed to cancel out one another, systematic
errors have a consistent and cumulative effect that gives the
results a constant bias. Obviously if a test shows constant
bias for a given individual, it is unsuited for that individual,
i.e., the individual differs sufficiently from the norm
population to render the test inapplicable. If such a test is given,
the person with language difficulty will tend to fail highly
verbal items, the uncooperative individual will answer only
those questions which he can be motivated to try, the individual
with a slight hearing loss may miss the critical part of the
question, etc. All of these factors tend to lower the basal
level, which per se gives a larger amount of scatter.

The uses to which measures of scatter have been put can
now be critically examined in the light of the sources of
scatter given above. Perhaps the first point that should receive
critical attention is that of the methods of measuring scatter,
for there are a great many such methods and the amount
of agreement among them seems to be inversely related to
their number. Theoretically, if scatter represents the range
of uncertainty of an individual’s ability, the range within
which lies the limen of his ability or the point beyond which
he will fail all the items (a point which in actual testing
practice does not exist), then scatter should be measured on
a scale of absolute difficulty, as the distance between the
easiest item failed and the most difficult one passed. Actually,
as no such measure has ever been used, it is not surprising
that, to quote Harris and Shakow (6), “research up to now
has failed to demonstrate clearly any valid clinical use for
such measures.” These authors mention nine methods of
measuring scatter and conclude that “at the present time it
is impossible to state which is the best method of measuring
scatter.”
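The theoretical measure described above, the distance on an absolute difficulty scale between the easiest item failed and the most difficult item passed, is simple to state in code. The sketch below is a minimal illustration under stated assumptions: the difficulty values are invented, and returning zero when there is no overlap is a convention adopted here, not one given in the article.

```python
def absolute_scatter(items):
    """Distance between the hardest item passed and the easiest item failed.

    items is an iterable of (difficulty, passed) pairs, with difficulty on
    some absolute scale. A perfectly consistent record (every passed item
    easier than every failed item) is given a scatter of zero by convention.
    """
    passed = [difficulty for difficulty, ok in items if ok]
    failed = [difficulty for difficulty, ok in items if not ok]
    if not passed or not failed:
        return 0.0
    return max(0.0, max(passed) - min(failed))

# Invented scale values for eight items.
responses = [
    (1.0, True), (1.4, True), (1.9, False), (2.3, True),
    (2.8, False), (3.1, True), (3.6, False), (4.0, False),
]

print(f"scatter on the absolute scale: {absolute_scatter(responses):.1f}")
```

Because the measure depends on absolute scale values for every item, it cannot be computed from age-level placements alone, which is consistent with the article's remark that no such measure had been used.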

Now, in view of the uncertainty about measures of scatter,
the uses to which they have been put become all the more
questionable. It is common practice to use measures of scatter
as indicative of epilepsy, psychosis, feeblemindedness,
emotional maladjustment, hypopituitarism, etc. Harris and
Shakow’s paper (6), in fact, deals with “the possibility of
obtaining clinically significant information from numerical measures
of scatter.”

Studies which have indicated significant differences in the
mean scatter for certain groups do not justify diagnosis of
a particular condition in the individual case. And when one
considers that for five papers that report such differences,
there are four that do not (6), even differentiation for groups
appears questionable. For indicating the type of condition in
question, there certainly should be a more refined instrument
than scatter on a test designed to measure intelligence.

From a consideration of the various bases for scatter,
it is obvious that their influence is certainly greater than that
of any chance errors. Therefore, in the light of these
considerations, the use of scatter as even a crude estimate of the
measurement error of the test for an individual is not an
acceptable procedure.

It is also common practice among clinical psychologists
to analyze the particular successes and failures of an individual
to give a crude appraisal of his primary abilities as well as
estimates of mental deterioration in these abilities. Such
inspectional analysis is a questionable practice for the
following reasons. First, the factorial composition of an item
cannot be prejudged accurately. Such judgments are frequently
in complete disagreement with factor analysis results. For
instance, Wright (15) found that items involving repeating
digits backwards did not necessarily involve memory ability
but rather other factors. And yet how many failures on such
items have been analyzed in reports as poor memory ability?
Secondly, an item may be solved through the use of different
abilities by different individuals and at different age levels.
Thirdly, items may show fairly high loadings on more than
one factor so that failure cannot be attributed to the lack
of any one ability. Fourthly, such clusters of items have too
low a reliability to have any real diagnostic value.

In summary, it is concluded that scatter is for the most
part due to factors inherent in test construction plus certain
systematic errors. In view of these facts, it appears that
the possibility of ever securing clinically significant information
from measures of scatter based on age scales in current use
is slight indeed.




REFERENCES 

1. Barber, E. R. “A Study of Scatter and the Relative
Difficulty of Sub-Tests in the Revised Stanford-Binet,” Master’s
thesis, University of Illinois, 1938.

2. Burt, C. “The Latest Revision of the Binet Intelligence Test,”
Eugenics Review, XXX, 4 (1934), 255-60.

3. Carlton, Theodore. “Performances of Mental Defectives on
the Revised Stanford-Binet, Form L,” Journal of Consulting
Psychology, IV (1940), 61-5.

4. Cattell, R. B. and Bristol, H. “Intelligence Tests for Mental
Ages Four to Eight Years,” British Journal of Educational
Psychology, III, 2 (1933), 142-69.

5. Harriman, P. L. “Irregularity of Successes on the 1937
Stanford Revision,” Journal of Consulting Psychology, III
(1939), 83-6.

6. Harris, A. J. and Shakow, D. “The Clinical Significance of
Numerical Measures of Scatter on the Stanford-Binet,”
Psychological Bulletin, XXXIV (1937), 134-50.

7. Harris, A. J. and Shakow, D. “Scatter on Schizophrenic,
Normal and Delinquent Adults,” Journal of Abnormal and
Social Psychology, XXXIII (1938), 100-11.

8. Krugman, Morris. “Some Impressions of the Revised
Stanford-Binet Scale,” Journal of Educational Psychology, XXX
(1939), 594-603.

9. Mateer, Florence. “Differential Syndromes in Stanford-
Binet Failures” (Abstract), Psychological Bulletin, XXXVI
(1937), 508.

10. Mosier, Charles I. “Psychophysics and Mental Test Theory:
Fundamental Postulates and Elementary Theorems,”
Psychological Review, XLVII (1940), 355-66.

11. Reymert, Martin L. and Meister, Ralph K. “A Comparison
of the Original and the Revised Stanford-Binet Intelligence
Scales,” Educational and Psychological Measurement, I
(1941), 67-76.

12. Terman, Lewis M. and Merrill, Maud A. Measuring
Intelligence. Boston: Houghton-Mifflin, 1937. Pp. 461.

13. Thurstone, L. L. “The Absolute Zero in Intelligence
Measurement,” Psychological Review, XXXV (1928), 175-97.

14. Thurstone, L. L. “A Method of Scaling Psychological and
Educational Tests,” Journal of Educational Psychology, XVI
(1925), 433-51.

15. Wright, R. E. “A Factor Analysis of the Original Stanford-
Binet,” Psychometrika, IV (1939), 209-20.





MEASUREMENT ABSTRACTS* 


Adkins, Dorothy C. “The Relation of Primary Mental
Abilities to Preference Scales and to Vocational Choice.”
Psychometrika, V (1940), 316. (Abstract of a paper
read at the September, 1940, meeting of the American
Psychological Association.)

Benge, Eugene J. “Wanted: More Logic and Less Guesswork
in Hiring Salesmen.” Sales Management, XLVIII,
No. 3 (1941), 18-20.

Companies who have records for many employees, past
and present, could profitably make an analysis of job
requirements. It is suggested that a criterion of job efficiency be set
up and employees classified accordingly. An outline, based on
these data, for constructing a rating scale in terms of factors
which can be elicited at time of application is presented.
Minimum scores on the rating scale are to be established according
to scores made by employees. It is emphasized that scores
on such a rating scale should not be considered alone but only
in connection with other sources of information in hiring
employees. D. A. Peterson.

Blackwell, A. M. “A Comparative Investigation Into the
Factors Involved in Mathematical Ability of Boys and
Girls.” British Journal of Educational Psychology, X
(1940), Pt. I, 143-53, and Pt. II, 212-22.

A group of 100 boys and a group of 100 girls, ages
ranging from 13^4 to 15 years, were given 10 tests of
“mathematical ability” including arithmetical reasoning, analogies,
three “spatial tests,” three “geometric tests,” and a test of
algebraic computation and reasoning. The intercorrelations
for each sex were factored separately. Interpretations were
attempted on the basis of the centroid matrices and on the
basis of an orthogonal rotation as designed to insure a general
factor. Sex differences were found. “The results of the study
seem to confirm the complex nature of mathematical
ability . . .” Harold Bechtoldt.

* Edited by Professor Forrest A. Kingsbury.

Coombs, Clyde H. “A Factorial Study of Number Ability.”
Psychometrika, VI (1941), 161-89.

In order to investigate certain hypotheses concerning the
nature of number ability, and, secondarily, the nature of
perceptual speed, a battery of 34 tests was given to 223 Chicago
high school seniors and the data were factored by the centroid
method. Seven primary factors were identifiable upon
rotation. Several deductions are made relative to the
interpretation of the factors and relative to the consistency of the data
with the hypotheses which were to be tested. (Courtesy
Psychometrika.)

Coombs, Clyde H. “A Criterion for the Number of Factors
in a Table of Intercorrelations.” Psychometrika, V
(1940), 315. (Abstract of a paper read at the
September, 1940, meeting of the American Psychological
Association.)

Cureton, E. E. “Testing in College Personnel Service.”
Journal of Consulting Psychology, IV (1940), 221-24.

A survey of the purposes of a college personnel service
and of the extent to which available tests lend themselves to
such purposes. The need for an adequately standardized
differential intelligence battery is emphasized. To show progress
and to give information concerning the pattern of abilities
and attainments, test scores should be directly comparable.
Suggestions are made on coordinating test production.
W. A. Varvel.

Dwyer, P. S. “The Solution of Simultaneous Equations.”
Psychometrika, VI (1941), 101-29.

This paper is an attempt to integrate the various methods
which have been developed for the numerical solution of
simultaneous linear equations. It is demonstrated that many
of the common methods, including the Doolittle method, are
variations of the method of “single division.” The most
useful variation of this method, in case symmetry is present,
appears to be the Abbreviated Doolittle method. The method
of multiplication and subtraction likewise can be abbreviated
in various ways, of which the most satisfactory form appears
to be the new Compact method. These methods are then
applied to such problems as the solution of related equations,
the solution of groups of equations, and the evaluation of the
inverse of a matrix. (Courtesy Psychometrika.)

Dwyer, P. S. “The Evaluation of Determinants.”
Psychometrika, VI (1941), 191-204.

The numerical evaluation of determinants with a modern
computing machine is discussed. Various methods are
presented and their relations to each other are indicated. The
methods presented parallel those developed in the previous
papers on “The Solution of Simultaneous Equations.”
Especially emphasized are the Abbreviated Doolittle and the
Compact methods. Additional topics include the evaluation of
partially symmetric determinants by means of symmetric
methods and the evaluation of determinantal ratios.
(Courtesy Psychometrika.)

Guilford, J. P. “The Difficulty of a Test and Its Factor
Composition.” Psychometrika, VI (1941), 66-77.

A factor analysis of the 10 sub-tests of the Seashore test
of pitch discrimination revealed that more than one ability is
involved. One factor, which accounted for the greater share
of the variances, had loadings that decreased systematically
with increasing difficulty. A second factor had strongest
loadings among the more difficult items, particularly those with
frequency differences of two to five cycles per second. A third
had strongest loadings at differences of five to 12 cycles per
second. No explanation for the three factors is apparent, but
the hypothesis is accepted that they represent distinct abilities.
In tests so homogeneous as to content and form, where a single
common factor might well have been expected, the appearance
of additional common factors emphasizes the importance of
considering the difficulty level of test items, both in the
attempt to interpret new factors and in the practice of testing.
The same kind of item may measure different abilities
accordingly as it is easy or difficult for the individuals to whom it is
applied. (Courtesy Psychometrika.)

Guilford, J. P. “A Note on the Discovery of a G Factor by
Means of Thurstone’s Centroid Method of Analysis.”
Psychometrika, VI (1941), 205-8.

A fictitious factor matrix including 16 tests and three
factors, one of which was a g factor, was prescribed. From it two
typical factor problems, including errors of sampling, were
derived. Students in training, without awareness of the factor
patterns, arrived at essentially correct solutions by the use of
Thurstone’s centroid method with rotation of axes. Errors in
the calculated factor matrix were very close in size to the
sampling errors in the correlation coefficients. It is concluded
that a g factor need not escape detection by Thurstone’s
procedures if the criteria of complete simple structure are not
demanded. (Courtesy Psychometrika.)

Horst, Paul. “A Non-graphical Method for Transforming an
Arbitrary Factor Matrix into a Simple Structure Factor
Matrix.” Psychometrika, VI (1941), 79-99.

The most commonly used method of factoring a matrix of
intercorrelations is the centroid method developed by L. L.
Thurstone. It is, however, necessary to transform the
centroid matrix of factor loadings into a simple structure matrix
in order to facilitate the interpretation of the factor loadings.
Current methods for effecting this transformation are chiefly
graphical and require considerable experience and personal
judgment. This paper presents a new method for transforming
an arbitrary factor matrix into a simple structure matrix
by methods almost completely objective. The theory underlying
the method is developed and approximation procedures
are derived. The method is applied to a matrix of factor
loadings previously analyzed by Thurstone. (Courtesy
Psychometrika.)

Hoyt, Cyril. “Test Reliability Estimated by Analysis of
Variance.” Psychometrika, VI (1941), 153-60.

A formula for estimating the reliability of a test, based on
the analysis of variance theory, is developed and illustrated.
The data needed for the required computation are the number
of correct responses to each item and the score for each
subject. The results obtained from this formula are identical with
those from one of the special cases of the Kuder-Richardson
formulation. The relationships of the new procedure to other
approaches to the problem are indicated. (Courtesy
Psychometrika.)
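The identity mentioned in this abstract can be checked on a small example. The abstract does not give the formula, so the sketch below relies on the form in which Hoyt's estimate is commonly stated (one minus the ratio of the residual mean square to the between-subjects mean square) and on the assumption that the Kuder-Richardson special case meant is formula 20. The response matrix is invented for illustration.

```python
def hoyt_reliability(scores):
    """Reliability from a subjects-by-items analysis of variance.

    scores is a list of rows, one per subject, of 0/1 item scores.
    Assumed form: 1 - MS(residual) / MS(subjects).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores)
    correction = grand ** 2 / (n * k)
    ss_total = sum(x * x for row in scores for x in row) - correction
    ss_subjects = sum(sum(row) ** 2 for row in scores) / k - correction
    ss_items = sum(sum(row[j] for row in scores) ** 2 for j in range(k)) / n - correction
    ss_residual = ss_total - ss_subjects - ss_items
    ms_subjects = ss_subjects / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return 1.0 - ms_residual / ms_subjects

def kr20(scores):
    """Kuder-Richardson formula 20 for 0/1 item scores."""
    n, k = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1.0 - pj) for pj in p)
    return k / (k - 1) * (1.0 - sum_pq / var_total)

# Invented data: five subjects, four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(f"analysis of variance estimate: {hoyt_reliability(scores):.3f}")
print(f"Kuder-Richardson 20:           {kr20(scores):.3f}")
```

For 0/1 items the two computations agree algebraically, so any difference in the printed values would be due to floating-point rounding only.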

Karlin, J. E. “The Isolation of Musical Abilities by Factorial
Methods.” Psychometrika, V (1940), 316. (Abstract of
a paper read at the September, 1940, meeting of the
American Psychological Association.)

Lazarsfeld, Paul F. (Guest Editor.) “Radio Research and
Applied Psychology.” Journal of Applied Psychology,
XXIV, No. 6 (1940), 661-853.

This entire number is devoted to 21 articles dealing with
problems of radio research (including two on magazine
advertising research techniques), not separately abstracted here
because of space limitations. They are classified into the
following five groups of papers: I. “Commercial Effects of Radio”
(F. Stanton, E. Smith & E. Suchman, M. Fleiss, M. Erdelyi);
II. “Educational and Other Effects of Radio” (S. Reid, J. R.
Miles, G. Wiebe); III. “Program Research” (J. N. Peterman,
H. Schwerin, C. Daniel, H. C. Link & P. G. Corby);
IV. “General Research Techniques” (E. A. Suchman & B.
McCandless, M. Rollins, H. Gaudet & E. C. Wilson, D. B.
Lucas, R. Franzen, P. F. Lazarsfeld); and V. “Measurement
Problems” (P. F. Lazarsfeld & W. S. Robinson, C. Daniel,
W. S. Robinson, R. Franzen). F. A. Kingsbury.



Lentz, Theodoic F , and Whitmei, Edith F “Item Synonymi- 
/.ation. A Method foi Deteimining the Total Meaning of 
Pcncil-Papei Reactions,” Psychometnka } VI (1941), 
131-9 

Items have been studied heretofore for their value as
elements of particular tests to the neglect of more fundamental
research into the multiple potentiality of items. This
article proposes a method of grouping items into "synonymies"
comprising all of the items which correlate with a given key
item. These synonymies can be used for interpretation of the
total meaning of the key item (1) by inspection of the
constituent items and (2) by correlational study of obtained
single scores of individual persons. The method is illustrated
by four items with inter- and intra-correlations, and
characteristics of an ideal background reservoir of items are pointed
out. (Courtesy Psychometrika.)

Martin, D. R. "Mental Tests in Clinical Practice." Australasian
Journal of Psychology and Philosophy, XVIII
(1940), 144-53.

The author discusses the purpose of mental testing in
child-guidance work and describes the battery of tests for general
intelligence, special abilities and disabilities, school
achievement, personality traits, and emotional stability in use at his
clinic. References are made to the recent controversy between
Cattell and Vernon over the value of the Binet test.
W. A. Varvel.

McNemar, Quinn. "On the Sampling Errors of Factor Loadings."
Psychometrika, VI (1941), 141-52.

The results of three empirical studies on the sampling
fluctuation of centroid factor loadings are reported. The first
study is based on data which happened to be available on 8
variables for 700 cases and which were factored to three
factors for subsamples. The second study is based on fictitious
data for 2500 cases which provided separate analyses on 25
samples for each of three situations: five variables, one factor;
five variables, two factors; and six variables, three factors.




The third study, based on real data for nine variables and
7000 cases, involves separate factorization for 25 samples of
200 cases. The three studies agree in showing that the
sampling behavior of first centroid factor loadings is much like
that of correlation coefficients, whereas the sampling
fluctuations for loadings beyond the first are disturbingly large.
(Courtesy Psychometrika.)

McNemar, Quinn. "More on the Iowa I.Q. Studies." Journal
of Psychology, X (1940), 237-40.

In a reply to the Wellman-Skeels-Skodak review [Psychological
Bulletin, XXXVII (1940), 93-111] of his original
critique of the Iowa studies on environmentally-determined
changes in I.Q., the author does not find it necessary to modify
materially his previous criticisms. W. A. Varvel.

McNemar, Quinn. "On the Number of Factors." Psychometrika,
V (1940), 315. (Abstract of a paper read at
the September, 1940, meeting of the American
Psychological Association.)


Porter, E. K. "Criteria of a Good Examination." Public
Health Nursing, XXXII (1940), 558-64.

Steps in the construction of a test are outlined. Principles
for the construction of tests and test items are presented.


Schaefer, Willis C. "The Relation of Test Difficulty and
Factorial Composition Determined from Individual and Group
Forms of Primary Mental Abilities Tests." Psychometrika,
V (1940), 316-17. (Abstract of a paper read at the
September, 1940, meeting of the American Psychological
Association.)

Van Steenberg, N. J. "Analysis of Mental Growth of School
Children." Psychometrika, V (1940), 314. (Abstract of
a paper read at the September, 1940, meeting of the
American Psychological Association.)





MEASUREMENT NEWS* 


Dr. John C. Flanagan has been granted a year's
leave of absence from the Cooperative Test Service in
order that he may accept a commission as a reserve officer
in the Army Air Corps. Dr. Flanagan will direct
developmental researches and make practical applications
with regard to problems of selection of Air Corps
personnel.


The authors of the Chicago Reading Tests, Drs. Max
D. Engelhart and Thelma Gwinn Thurstone, are
conducting an investigation of the comparability of the
norms of these tests from form to form and also their
comparability with the norms of the Metropolitan and
Stanford Reading Tests. Each series of forms of the
Chicago tests was standardized independently in successive
years by administration to pupils in a representative
sample of 30 Chicago elementary schools. Approximately
8,000 elementary pupils took each form when it
was given for standardization. In addition, two forms of
the sixth-, seventh-, and eighth-grade test were administered
in successive years to approximately 8,000 Chicago
high-school pupils. The assumption which was made
and is now being tested was that norms based on large
*Notes for this department should be sent to Dr. M. W. Richardson, United
States Civil Service Commission, Washington, D. C.



samples of pupils drawn from the same schools should be
comparable. In the current study each of the three forms
of each of the four Chicago tests, and the appropriate
Metropolitan and Stanford tests, were administered to
several hundred pupils in randomized order to control
practice effect. For example, the three forms of Chicago
Reading Test D and the Metropolitan and Stanford
Advanced Reading Tests were administered to the same
elementary pupils. It is planned to determine the equivalence
of the raw scores and, on the basis of these data,
to make any necessary adjustments to secure precise
comparability of the norms.



Professor Karl J. Holzinger of the University of Chicago
has written a treatise on Factor Analysis with the
assistance of Harry H. Harman, Research Associate. This
volume is being published by the University of Chicago
Press and will be ready this fall. Professor Holzinger
is also joint author of two new monographs on the
application of factorial methods. The collaborating authors
are M. A. Wenger, Frances Swineford, and Harry H.
Harman. These monographs will also be published by
the University of Chicago Press.


Wright Junior College in Chicago has recently begun
a three-year study on the evaluation of terminal education.
The study is a part of a comprehensive investigation
of junior college terminal education being carried
on in nine selected junior colleges by the American
Association of Junior Colleges with a grant made by the
General Education Board.

The Wright study will attempt to evaluate the pres¬ 
ent terminal general and terminal occupational programs 



offered at that institution. One of the purposes of the
study will be the development of techniques of evaluation
for use by other schools.

An extensive measurement program will be initiated
in September for the incoming freshman class. Measurements
will be made in twelve areas: effective thinking,
command of skills and understandings in the major
cultural areas, functional understanding of the basic facts of
health and disease, interests, appreciations, consumer
competence, occupational efficiency, personal-socio
adaptability, socio-civic consciousness, attitudes, worthy use of
leisure, functional philosophy of life. Several measuring
devices now available will be used as well as others which
are now being developed as a part of the study. Those
in the former category include the Cooperative Test Service
General Culture and Contemporary Affairs tests, the
Kuder Preference Record, and two of the tests of the
Progressive Education Association on interpretation of
data and nature of proof.

The study is being conducted by a group composed of
Dean William H. Conley, Bernard Gold, Alice Griffin,
Max D. Engelhart, and Leland Medsker.


Soon to be published by the Social Science Research
Council is a monograph on the prediction of personal
adjustment. The text has been prepared by Dr. Paul
Horst of Procter and Gamble. The monograph will deal
with personal adjustment in connection with vocations,
schools, marriage, and criminal recidivism.





EDUCATIONAL AND PSYCHOLOGICAL 

MEASUREMENT 


Volume I OCTOBER, 1941 Number 4 


Cumulative Test Records: Their Nature and Uses . . . 323
    Arthur E. Traxler

An Analytical Description of Student Counseling . . . 341
    E. G. Williamson and E. S. Bordin

A Composition Test for Foreign Languages . . . 355
    Lawrence Andrus

Performance Testing in Public Personnel Selection, Part II . . . 365
    Sidney W. Koran

The Value of Intelligence Quotients Obtained in Secondary
School for Predicting College Scholarship . . . 387
    L. D. Hartson and A. J. Sprow

The Thurstone Mental Abilities Tests and College Marks . . . 399
    Mary Lou Ellison and Harold A. Edgerton

A Short Cut in the Estimation of Split-Halves Coefficients . . . 407
    Charles I. Mosier

Measurement Abstracts . . . 409

Index for Volume I



Copyright, 1941, by

SCIENCE RESEARCH ASSOCIATES




PRINTED IN THE UNITED STATES OF AMERICA



CUMULATIVE TEST RECORDS: THEIR NATURE
AND USES


ARTHUR E. TRAXLER
Educational Records Bureau


MOST SCHOOLS now recognize certain values in objective
tests of academic aptitude and achievement and
employ such tests to some extent in the appraisal, placement,
instruction, and guidance of their pupils. It is generally
understood that objective tests do not measure all educational
outcomes, but studies have repeatedly demonstrated that they
do measure certain aspects of ability and achievement that are
important in the scholastic success of individual boys and
girls.

It may seem merely a reiteration of an obvious point to
say that the value of a testing program is directly proportional
to the nature and extent of the uses of the results by the
faculty and students. Nevertheless, emphasis on this point is
necessary, for not infrequently school authorities administer a
series of tests, file the scores, and then give no further
attention to the test data. When no improvement is noted, they
blame the tests when the real fault lies with their failure to
study the results. Tests are not in themselves remedial
instruments; they are tools which can be indispensable aids to
diagnosis and thus form an important basis for the planning of
instruction and guidance, provided someone carefully and
intelligently studies the data which they provide. The analysis
of the test results may to some extent be concerned with
groups, but it should deal primarily with individuals.

Before test results can be studied and used to best advantage
they must be recorded in some convenient form.
Alphabetical class lists of the scores and percentile ranks of
individual pupils, accompanied by sheets showing the distributions



of scores for various classes, are very useful to teachers for
purposes of a quick survey of the results for individuals or
groups with reference to restricted areas of ability or
achievement. They do not, however, readily provide a
comprehensive picture of the results of all tests taken by any one
individual.

Individual record sheets on which the results from a single
testing program are summarized in tabular or graphic form
provide a very helpful picture of the status of a pupil at any
one time. One can, for example, administer a general
achievement test battery such as the Metropolitan or the Stanford,
plot the achievement profile of each pupil, and thus obtain
a graphic representation of strengths and weaknesses that
greatly simplifies the problem of diagnosis. A graph of this
type is illustrated in Figure 1. One can see at a glance that in
comparison with the grade norms, this pupil is strong in
reading, literature, history and civics, and geography, but relatively
weak in arithmetic and spelling.

However, distributions, class lists, and diagnostic profiles
resulting from one testing program share a common
limitation. They show status, but they do not show growth. For
both instruction and guidance, the concept of growth (how
far a pupil has come within a certain period and how far
he should be able to go) is probably fully as important as the
concept of present status.

Now, there is a type of record that provides evidence about
both status at any testing period and growth between testing
periods. This is the individual cumulative record. It is
unquestionably the most valuable aid to the intelligent and
efficient use of test results yet devised. It is to other kinds of
records what a motion picture is to a snapshot.

The cumulative record presupposes a regular, systematic
testing program. If tests are administered in a school at
irregular intervals and without definite plan, the value of a
record of this kind will be greatly curtailed, but even under
these conditions, it will probably prove more useful than any
other kind of record of test results.




[Figure 1. Educational Profile Chart, Metropolitan Achievement Test.
The chart shows the pupil's profile together with the median profile
for Grade 7 in independent schools. Values above Grade 9.0 and
below 4.0 are extrapolated.]
















Explanation and Illustration of Cumulative Test Records

Any record for individual pupils that provides for successive
additions of the same type of data over a period of
years and thus makes possible a study of the changes that have
taken place may be called a cumulative record. Thus, a tabular
arrangement of test scores and percentiles by years is
cumulative in nature. Test results entered in this way, however,
cannot be apprehended quickly, but require detailed
study. This fact has led many persons to favor the graphic
representation of test results. Among the various types of
graphs thus far devised, the gridiron percentile graph first
employed in the American Council cumulative record forms
stands out as the most widely used type. Its success is no
doubt due partly to the fact that it will accommodate any
kind of test result that can be expressed in terms of percentiles.

One of the best known adaptations of the American Council
form is the Educational Records Bureau cumulative record
for independent schools. This form, like the American Council
form, is planned for six years, but it may be expanded to
include any number of years. The test portion of this type
of record is illustrated in Figures 2, 3, and 4. Let us note in
some detail the nature of the information provided.

The card is divided by heavy vertical lines into broad
columns, each of which represents a grade or a year in the
life of the pupil. The year and grade are indicated at the
top of each column.

The front of the card is devoted almost entirely to a record
of class work and to an extensive test record. Since
the main purpose of this article is to discuss test records, the
portion of the sample forms dealing with subjects, marks, and
credits has not been filled out. The test results are reported
in both tabular and graphic form. The scores and corresponding
percentiles are entered in the table and the percentiles are
then used as the basis of the graph, which occupies approximately
the lower half of the card.

The graph of test scores is the clearest phase of the record
to one familiar with graphs of this kind, but it often seems




somewhat puzzling to persons who have had no experience
with it. The percentiles along the scale at the left are placed
according to standard deviations in a normal distribution, and
thus the distance between successive percentiles is much smaller
near the median than near the extremes. The median, or
50th percentile, is marked by the heavy line going horizontally
across the graph. The symbols at the top (Jy, Au, S, O, and
so forth) stand for the months of the year. The months are
grouped according to the school year rather than the calendar
year.
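
The spacing of a scale of this kind can be sketched in a few lines of Python. This is only an illustration, assuming the scale is the ordinary normal-deviate (sigma) scale; the function name percentile_to_sigma and the list of tick marks are hypothetical and are not taken from the record form itself.

```python
from statistics import NormalDist


def percentile_to_sigma(percentile: float) -> float:
    """Return the standard-normal deviate for a percentile strictly between 0 and 100."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return NormalDist().inv_cdf(percentile / 100)


# Hypothetical tick marks for the left-hand scale (an assumption, not the form's own list).
ticks = [1, 10, 25, 50, 75, 90, 99]
for p in ticks:
    # Each percentile is plotted at its distance from the median in sigma units.
    print(f"{p:>2}th percentile -> {percentile_to_sigma(p):+.2f} sigma")
```

On such a scale the step from the 50th to the 55th percentile is shorter than the step from the 90th to the 95th, which is the crowding near the median that the text describes.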

The same percentile data that are shown in the table of
scores are entered in the graph, except that to prevent
overcrowding, the percentiles for the parts of the English test
have been omitted from the graph. The percentiles used in
these records are based on results in independent schools, but
the interpretation of public-school percentile ratings would
be made in exactly the same way.

The small dots on the graph show the placement of the
various percentiles, the dots being identified by the abbreviated
names of the tests printed near them. For example, in Figure
2, the dot toward the top of the graph is labeled "French"
to indicate that it stands for the percentile on the Cooperative
French test. The percentile for the pupil's total score of 61
on the French test is 93, and this is indicated by placing the
dot opposite 93, one of the points shown on the percentile
scale at the left of the chart. In other words, the pupil's
French score was above the scores of 93 per cent of the
independent-school ninth-grade first-year French students who took
the test in the spring of 1941.
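
The reading of a percentile as "above the scores of so many per cent of the group" can be sketched as follows. This is a minimal illustration only: the function percentile_rank and the norm group are hypothetical, and the strictly-below counting rule is an assumption, since the article does not say how ties were handled in preparing the norms.

```python
def percentile_rank(score, norm_scores):
    """Per cent of norm-group scores that fall strictly below `score`."""
    norm_scores = list(norm_scores)
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)


# Made-up norm group of 100 scores (1 through 100), used only to exercise the function.
hypothetical_norm_group = range(1, 101)
print(percentile_rank(94, hypothetical_norm_group))
```

With real norms the group would be the pupils of the same grade and course who took the same test at the same season, as in the French example above.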

The percentile points for tests that are in the same field
from year to year are connected by lines, so that one can
readily follow a particular type of achievement throughout the
whole period covered by the test. For instance, one of the
lines in Figure 3 runs from the arithmetic percentile in Grade
6 to the arithmetic percentile in Grade 7, and from that point
to the arithmetic percentile in Grade 8, thence to the
elementary algebra percentile in Grade 9, et cetera. Achievement
percentiles are connected with solid lines, academic aptitude
percentiles with broken lines, and chronological age
percentiles with dotted lines.

The record shown in Figure 2, that of Edwin Martin,1
covers only one year. In many independent schools, a
considerable proportion of the records will be of this type, for the
number of one-year students attending private schools tends
to be fairly large. Even in the case of single-year records,
it is desirable to record the data on a cumulative record form,
for such a procedure facilitates comparisons between academic
aptitude and achievement and makes it possible to summarize
readily the student's test record for the year as a whole.

Figure 2 shows that in the fall of 1940, Edwin was close
to, but slightly below, average for his grade in chronological
age and that he was somewhat below the median in academic
aptitude, as indicated by the results of the American Council
Psychological Examination, and in reading, as measured by
the Nelson-Denny Test. These results are recorded directly
below the letter O, which shows that the data were obtained
in October.

The spring, 1941, percentiles are entered beneath the
letter A, and thus one knows that the tests were given in April.
Edwin seems to be an able student of foreign language. As
already indicated, his French score was above those of all but
7 per cent of the first-year French students in Grade 9. His
total score on the Latin test fell within the highest third of
the independent-school ninth-grade first-year group.

In science and elementary algebra, the boy was above the
independent-school median but not outstanding. His total
English percentile and his literary acquaintance percentile were
below the median but above his academic aptitude percentile.
In general, Edwin's achievement test percentiles were
somewhat higher than his percentiles in academic aptitude and
reading. This is, of course, an encouraging finding, for it
indicates that, presumably because of application and hard

1 These are actual test records, but the names of the pupils and the schools
are fictitious.




work, the boy's achievement record near the end of the school
year was better than one would expect it to be on the basis
of the fall test results.

Let us now examine a cumulative record covering several
years. The test record of Betty A. Stetson, as shown in
Figure 3, is that of a girl who was a little younger than the
average pupil in her grade but who was generally high in both
academic aptitude and achievement. Her Otis intelligence
test percentiles in Grade 6 were exceptionally high. Her
later academic aptitude percentiles were a little lower, but all
of them were significantly above the median for her grade. In
fact, her Otis scores in Grades 6 and 7 and her scores on the
American Council Psychological Examination in Grades 9 and
10 were in the highest tenth of the scores made by the
independent-school pupils at the same grade levels.

Betty's achievement test percentiles were, in general,
somewhat lower than her percentiles in academic aptitude, but most
of them were in the upper half of the scores of the pupils
in her grade. The only achievement percentiles below the
median were those for spelling in Grade 6, geography in
Grade 7, arithmetic in Grade 8, general science in Grade 10,
and modern European history in Grade 11. The history score
was the only very low result in the entire record.

This girl is obviously an excellent reader. On the reading
tests, she maintained a position within the highest tenth of her
grade throughout the entire period.

[Figure: Cumulative Record for Independent Schools, Educational
Records Bureau, 437 West 57th Street, New York.]

In a graphic record of this kind, growth in any subject
precisely equal to the growth of the group as a whole in that
subject is shown by an exactly horizontal line. That is, if a
pupil improves just as much as the group improves in a year,
he will maintain the same percentile rating from one year to
the next. Lines which go upward, then, indicate greater than
average growth and lines which slope downward suggest less
than average growth. In interpreting such variations,
however, one should keep in mind the fact that every test involves
a certain amount of sampling error and that the population on
which the percentiles are based is not exactly the same from
year to year. So some variation in percentiles for the same
subject is to be expected even when growth is normal. One
should also remember that tests of different subjects in the
same field (for example, algebra and plane geometry or
biology and chemistry) involve somewhat different abilities. In
these instances, changes in percentile rating should not be
interpreted in terms of growth.
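
The rule for reading the direction of a line can be sketched as follows. This is a rough illustration under stated assumptions: the function classify_trend is hypothetical, it compares only the first and last percentiles on the same test, and the tolerance of 10 percentile points is an arbitrary allowance for sampling error, since the article names no specific figure.

```python
def classify_trend(percentiles, tolerance=10.0):
    """Label a pupil's line on one test as rising, falling, or roughly level.

    `tolerance` is an assumed allowance (in percentile points) for ordinary
    sampling variation; changes inside it are treated as average growth.
    """
    if len(percentiles) < 2:
        raise ValueError("at least two yearly percentiles are needed")
    change = percentiles[-1] - percentiles[0]
    if change > tolerance:
        return "greater than average growth"
    if change < -tolerance:
        return "less than average growth"
    return "about average growth"


# Made-up yearly percentiles on the same test, for illustration only.
print(classify_trend([60, 62, 58]))
print(classify_trend([80, 65, 50]))
```

A sketch of this kind applies only to successive forms of the same test; as the text notes, a change from algebra to plane geometry should not be read as growth or loss.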

However, marked gains or losses in percentiles on the same
test, such as an English test or a foreign language test, are
symptomatic, and they may be indicative of a need for
counseling attention in order to find the reason for the variation.
For example, the steady downward trend of Betty's
percentiles in French would seem to require an explanation that
cannot be obtained from the record itself.

On the whole, the seven-year record shown in Figure 3
indicates definitely superior ability and achievement. A
counselor, school principal, or college admissions officer familiar
with this type of record could decide almost at a glance that,
as far as aptitude and attainment are concerned, this girl
should do well even in a highly selective college.

The record of Charles W. Loring, shown in Figure 4, also
includes Grades 6 to 12, inclusive, but it covers a period of
eight years rather than seven, since this boy repeated the
eighth grade. The general level of the percentiles is in marked
contrast to that of the percentiles on the preceding record.
This is an over-age pupil who is rather low in both academic
aptitude and achievement in comparison with the average for
his grade. Because he is advanced in chronological age, the
percentiles corresponding to mental age and raw scores on
intelligence tests tend to be somewhat higher than the
percentiles for IQ, but with one exception they are below the
independent-school median.

In Grade 6, all but one of Charles' scores on the New
Stanford Achievement Test were distributed below the median
for independent-school pupils at that grade level. The next
year, most of his achievement test percentiles went upward to
some extent, but the percentiles for language usage and
arithmetic were the only ones above the independent-school median
for Grade 7. In the following years, nearly all his percentiles
were below the median for his grade, although some of them
were not far below it.

There is some evidence that this boy had read rather
widely and that he was a fairly competent reader. Three
of his literature percentiles and three of his reading percentiles
were above the median for his grade.

Charles' repetition of Grade 8 did not significantly raise
his subsequent record on the achievement tests. The only line
that went upward noticeably as a result of the repetition of
this grade was the one for chronological age. The whole
record is that of a pupil who probably should not attempt
to enter the usual liberal arts college after graduation from
the secondary school. Rather, he needs guidance into preparation
for some type of vocation the demands of which are
consistent with his mediocre scholastic attainments.

It will be observed that there is a general consistency in
the test results throughout both illustrative records. The girl
whose record is shown in Figure 3 was high in academic
aptitude and achievement in the elementary school and she
maintained this superiority throughout the secondary school. The
boy whose record is contained in Figure 4 was low in academic
aptitude and achievement in the elementary school and this
low record was continued in the secondary school. In both
cases, the general level of the percentile ratings in Grade 12
could have been predicted from the results of the achievement
tests taken in Grade 6.

The tendency of the cumulative record of test results for
an individual pupil to be in agreement from year to year is
one of the most noteworthy aspects of this type of record.
This tendency is verified by hundreds of such records which
have been prepared at the Educational Records Bureau and
other institutions. While the percentiles on an occasional test
may vary markedly in successive years, the whole picture of
a pupil's record tends to remain much the same. This is




usually true regardless of transfer from one school to another
or variation in type of instruction. Comprehensive test
records obtained even as low as the second or third grade often
predict with remarkable fidelity the level of achievement a
pupil will attain in his senior year of high school. The fact
that test results distributed over a long period of time tend
to be positively correlated causes cumulative test histories to
have exceptional potential guidance values.
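
The statement that early and late results tend to be positively correlated can be checked on any set of paired records with a product-moment correlation. The sketch below is illustrative only: pearson_r is a hypothetical helper written out by hand, and the two lists of percentiles are made-up figures, not data from the Educational Records Bureau.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Made-up percentiles for five pupils in an early grade and in the senior year.
early_grade = [20, 35, 50, 65, 90]
senior_year = [25, 30, 55, 70, 85]
print(round(pearson_r(early_grade, senior_year), 2))
```

A coefficient near +1 on real records would correspond to the stability the article reports; the size of the coefficient in any actual school is an empirical matter that this sketch does not settle.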

Notwithstanding the general tendency just indicated, it
is true that a pupil's test record for an entire year may
sometimes be decidedly out of line with his scores in other years.
When this happens, an explanation that can be made only in
the light of much other information about the pupil is
required. Consequently, a cumulative record of other kinds of
data is needed if one is to make an adequate interpretation
of the test record. For this reason and many other reasons,
it is advisable for schools to maintain cumulative records that
cover not only test results but also home background,
class work, interests and activities, personality adjustment, and
various other factors. The interrelationships of the different
kinds of information that can be recorded on a form similar
to the American Council card are brought out by the record
for Harry Connelly, as shown in Figure 5.

It is evident from Figure 5 that this boy tended to be below
the independent-school median in his scores on the Metropolitan
Achievement Test taken in Grade 8, but that he was
consistently above the median in scores on all the achievement
tests taken in Grades 9 to 12. He was especially high in
English, literary acquaintance, and science. The boy's superior
test record in the four high-school grades agrees with his consistently
high percentile ratings on the academic aptitude tests.
It appears that an explanation of the marked difference
between Harry's test scores in Grade 8 and his test scores
in the later years is to be found in the data entered on the
back of the card. Although he was obviously bright, he was
lazy and disorderly in the eighth grade and it is probable
that he lacked both the preparation and the interest in the test
itself that would be required for high scores on the Metropolitan
test, or any other test of general achievement. There
was much improvement in the boy's behavior and attitude
in the ninth grade and in subsequent grades, and consequently
his achievement increased until it was proportionate to his
ability. The whole picture is that of an intelligent, able boy
who was immature in behavior and attitude but who became
much more mature during the secondary-school years. At the
end of the secondary school, he unquestionably had the ability
and the preparation for better than average college work.
Experience indicates that a record of this kind furnishes a far
better basis for prognosis of college success than is provided
by a transcript of credits and an admission form filled out
by the school when the pupil is near the end of his secondary-school
course.

Cumulative records in terms of percentiles are not the
only kind of graphic record of test results that can be used.
If the results of all tests employed in a school's program
are expressed in terms of standard scores, Scaled Scores, or
some other comparable unit, the data may be graphed on that
basis. Such units are sometimes preferable to percentiles for
purposes of showing growth and if they take their origin from
a common standardization group, as do the Scaled Scores
of the Cooperative Test Service, the influence of the selective
factor found in certain subjects, such as the foreign languages,
is obviated.2
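
As a rough illustration of expressing results in a comparable unit, the
sketch below converts raw scores to standard scores against the mean
and standard deviation of one common norm group. It is not the
Cooperative Test Service scaling procedure; the norm values, the raw
scores, and the scale of mean 50 and standard deviation 10 are all
assumptions made for this example.

```python
def to_standard_score(raw, norm_mean, norm_sd, scale_mean=50.0, scale_sd=10.0):
    """Express a raw score in standard-score units relative to a norm group."""
    if norm_sd <= 0:
        raise ValueError("norm-group standard deviation must be positive")
    z = (raw - norm_mean) / norm_sd          # distance from the norm mean in SD units
    return scale_mean + scale_sd * z         # rescale to the reporting scale


# Hypothetical norms (mean, SD) taken from ONE common standardization group.
norms = {"English": (120.0, 20.0), "French": (60.0, 12.0)}

# Hypothetical raw scores for one pupil.
raw_scores = {"English": 150, "French": 66}

for subject, raw in raw_scores.items():
    norm_mean, norm_sd = norms[subject]
    print(subject, to_standard_score(raw, norm_mean, norm_sd))
```

Because both subjects are referred to the same group, the resulting
units can be placed on one graph and compared from year to year.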

It should be clearly understood that cumulative records
of test results can be kept without the use of any graph whatsoever.
The preparation of the graphic part of the record is,
of course, a time-consuming clerical job. While it is a distinct
aid to interpretation, schools in which the time and cost
of the graph would be prohibitive can maintain usable test
records in tabular form. The main thing is to record the data
in organized fashion so that trends can be discerned.

2 For an illustration of a cumulative record based on Scaled Scores see John
C. Flanagan, The Cooperative Achievement Tests: A Bulletin Reporting the
Basic Principles and Procedures Used in the Development of Their System of
Scaled Scores, p. 37. New York: The Cooperative Test Service, December, 1939.

Uses of Cumulative Test Records 

As already indicated, the uses that are made of cumulative
test records depend largely on the interest, initiative, and
understanding of the administration and faculty in each local
school. Among the possible uses of such records are the
following:

1. Counselors may use cumulative test records in conferring
with pupils and guiding them toward educational and
vocational choices consistent with their ability and achievement.

2. Teachers may study them in order to plan their instruction
to accord with the aptitude, knowledge, and understanding
of the individuals in their classes.

3. Administrators and personnel officers may refer to
them when conferring with parents about their children.3

4. Principals and guidance directors may take them into
consideration when recommending graduates to colleges or to
prospective employers.

5. College admissions officers may use them as one type
of evidence on which decisions about admitting applicants are
based. In order to conserve the time of the college admissions
officer, the school should of course include a paragraph
of interpretation when the record is sent to the college. Admissions
officers expect to receive from the school an estimate
of a candidate's fitness and they will place more credence in
the estimate when it is based in part upon tangible information.

6. Schools may employ them in placing transfer pupils in
courses to which they are suited.

7. Administrative officers and department heads may use
them in sectioning classes on the basis of ability.

3 An excellent discussion of this type of use is given in Robert N. Hilkert,
"Parents and Cumulative Records," Educational Record, Supplement No. 13, pp.
172-83. Washington, D.C.: American Council on Education, January, 1940.



8. Remedial teachers may consult them in selecting pupils
for special remedial work and in planning that work.

9. Psychologists and psychiatrists may turn to them for
leads in diagnosing personality maladjustments and planning
treatment.

10. Superintendents and principals may make limited use
of them in appraising the work of the school and introducing
modifications. This type of use should be carefully thought
out and cautiously applied.

11. Counselors and teachers may refer to them as a means
of stimulating pupils to do their best work. This is a legitimate
use if the comparison is directly with the previous record
of each pupil and only indirectly, or not at all, with that of
other pupils.4

12. Finally, the entire faculty may employ cumulative test
records in developing what is perhaps the school's most important
function: the planning for each pupil of a program
that is suited to him and the individualization of instruction
in accordance with such a program.5

The American Council cumulative record forms, from
which the card used in the illustrations in this article was
adapted, are now being revised.6 A tentative draft of the
revised high-school form is ready and it will be tried out soon
in several public high schools. Changes have been made in
various parts of the record to accord with modern trends in
education. It is significant that in the revised form the test
section continues to be one of the most important aspects of
the record. Any forward-looking cumulative record, regardless
of whether it is devised by an organization of national
scope or by a local school system, will inevitably include a
thorough test record, for it is becoming generally recognized
that a prerequisite to an adequate program of guidance is a
comprehensive, systematic testing program.

4 The use of test records in pupil self-appraisal is described in Richard D.
Allen, Self-Measurement Projects in Group Guidance, Inor Group Guidance
Series, Volume III (New York: Inor Publishing Company, Inc., 1934), xviii +
274.

5 See Ben D. Wood, "The Need for Comparable Measurements in Individualizing
Education," Educational Record, Supplement No. 12, pp. 5-13.
Washington, D.C.: American Council on Education, January, 1939.

6 The revision of the American Council cumulative record forms is being
done by a subcommittee of the Committee on Measurement and Guidance of
the American Council on Education. The chairman of the subcommittee is
Eugene R. Smith, Beaver Country Day School, Chestnut Hill, Massachusetts.





AN ANALYTICAL DESCRIPTION OF STUDENT
COUNSELING 1

E. G. WILLIAMSON
and
E. S. BORDIN
University of Minnesota

THE OBJECTIVES of the student counseling program
at the University of Minnesota have not changed fundamentally
since its inception in the Arts College in 1923. The
aims of the group of Arts College faculty counselors as stated
in 1928 by Paterson were: "First, to bring about a more
harmonious adjustment of individual students to the opportunities
available within and without the University, and second,
to establish, as far as possible, a friendly and constructive
relationship between individual members of the faculty and
students desiring such contact." (2: 265-266)

Subsequent trends in the University's curricular organization
and the developing facilities for personnel work have led
to a greater differentiation of function within the total counseling
program.2 The increasing complexity and the consequent
professionalization of certain types of counseling resulted
in the establishment of the University Testing Bureau
among other specialized agencies for the treatment of student
problems. This University-wide counseling agency is both
coordinate and coordinated with the counseling agencies of the
separate colleges within the University.

1 Assistance in the tabulation and summarization of materials was provided
by Minnesota work projects under project 6714, sub-project 85, sponsored by
the University of Minnesota.

2 For an historical treatment of these developments see E. G. Williamson
and T. R. Sarbin (Minneapolis: Burgess Publishing Company, 1940), 115 pp.



The Testing Bureau, in its function as a counseling
agency,3 provides professionalized educational and vocational
guidance supplementary to the services of other personnel
departments on the campus. Counseling is performed on
an individualized basis, the counselor using information from
tests, reports from other personnel workers on the campus
and from community and high school agencies, and from
clinical interviews with the student.

In a series of papers published recently, the authors
treated the problem of the evaluation of these counseling
services. The first in the series presented a systematic analysis
of experimental methods as applied to this type of counseling
(4). Based upon our conclusions with regard to
method, two experimental evaluations of this counseling were
reported.

The first of these experiments (5) investigated the relative
adjustment of students who did and did not cooperate
with the counselor. The criteria of adjustment and cooperation
were judgments by workers who had not been involved
in the counseling process and were based on readings of the
case history and follow-up interviews. The results showed
that students who cooperated were more likely to be adjusted.

The second experiment (6) tested the hypothesis that
students counseled by the Bureau would be better adjusted
and more successful academically than students who had not
been counseled by the Bureau or any college counseling agency.
This hypothesis was found to hold for the comparison of a
counseled with a matched non-counseled group of freshman
Arts College students. The criteria in this study were judgments
of adjustment and cooperation and average grade
achievement.

Future progress in counseling of this nature will depend
upon knowledge of the resources and techniques utilized by
the counselors, the types of problems dealt with, and the
effectiveness with which these problems were handled. In
this paper we are concerned with giving a representative picture
of the resources and general techniques utilized in the
University Testing Bureau over the period from 1932 to
1935.

3 The Bureau also functions as a University-wide testing agency and as a
focus for research in testing and counseling.

Analyses of faculty counseling in the Arts College (3, 8)
and an exploratory analysis of Testing Bureau counseling
(4: 253-260) form the background for the present study.
The exploratory study of Bureau counseling was based on a
sampling of 196 student cases, analyzed as to origin, class,
and college. The representativeness of these cases in terms of
high school scholarship and college aptitude score was determined.
Summaries were presented of the kinds of case data
used, the agencies consulted or referred to for diagnosis and
treatment, the types and frequencies of student problems, and
the general counseling techniques used.

The present study is designed to amplify the description of
Bureau counseling from a much broader sampling of the
total case load. A total of 2053 student cases, the bulk
of students who came in for complete counseling services over
the period from 1932 to 1935, formed the population for
this survey.4 The actual case history folders, including records
of counseling interviews, were analyzed, and the presence
of certain items of information tabulated. No questionnaires
filled out by the students were used for this analysis. This
study, therefore, provides an answer to the question, "What
is counseling?" in terms of the judgments of counselors made
in terms of particular students and not of students in general.

Of the 2053 cases, 1223 students were men and 830
were women. Classified according to year in college, there
were 617 pre-college students (recent high school graduates),
721 freshmen, 482 sophomores, 143 juniors, 54 seniors and
36 graduate students. By college the distributions were: General
College, 428; Arts College, 1038; pre-college who did
not matriculate in the University, 197; Chemistry-Engineering-Mines,
133; Agriculture, 79; Education, 48; Business, 34;
Medical-Dental-Pharmacy, 23; Graduate School, 36; Nursing,
18; and University College, 5.

4 By "complete counseling services" is meant testing and extensive interviewing.
The Bureau also provides many types of testing services for members
of the student personnel staff of the University.
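
The sex and class-year breakdowns reported above can be checked against
the stated total of 2053 cases. The sketch below is only a bookkeeping
aid; the dictionary and function names are assumptions of this example,
and the counts are copied from the text.

```python
# Counts as reported in the text for the 2053 cases.
TOTAL_CASES = 2053

by_sex = {"men": 1223, "women": 830}
by_class = {
    "pre-college": 617,
    "freshmen": 721,
    "sophomores": 482,
    "juniors": 143,
    "seniors": 54,
    "graduate": 36,
}


def check_total(counts, expected):
    """Return the sum of a breakdown and whether it matches the expected total."""
    subtotal = sum(counts.values())
    return subtotal, subtotal == expected


for name, counts in (("sex", by_sex), ("class", by_class)):
    subtotal, ok = check_total(counts, TOTAL_CASES)
    print(f"{name}: {subtotal} ({'agrees' if ok else 'differs'})")
```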

Origin of the Cases

The efficiency of a counseling program must, in part, be
measured on the basis of its integration and coordination with
other personnel functions. This criterion implies that counselors
at various levels of specialization are aware of the
limits of their functions and are making use of the services
of specialized personnel workers through the medium of
referral.

With this in mind the origin of the cases counseled in the
Bureau becomes pertinent to its efficiency. Obviously, the two
main categories of origin are referred and voluntary. Over
fifteen years of experience in the counseling program at Minnesota
have shown that the best results can be achieved when
the student comes voluntarily to the counselor or seeks assistance
at the suggestion, but not command, of some member of
the University staff or student body. Of the total of 2053,
1069 of the students were classified as voluntary cases. Actually,
of the 984 remaining students classified as referred cases,
in only 93 cases was referral made by University officials
in the spirit of pressure. These were students of low scholarship
who had been referred to the Bureau as part of procedures
involved in scholastic discipline. A total of 791 students
had been referred to the Bureau for testing and counseling
after interviews with a college counselor or faculty
member. In addition, 100 students had been referred by high
school counselors or community welfare agencies.

The largest proportion of the voluntary cases, 892, came
to the Bureau after having heard about its services through
bulletins, class lectures, friends, or relatives. In addition, 122
students were told about the Bureau's service in the Registrar's
office and 55 had learned about it from high school or
college teachers other than counselors.
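
The origin figures given in this section can be tallied and turned into
percentages of the whole case load. This is an illustrative sketch
only; the category labels are paraphrased from the text and the helper
percent_shares is an assumption of this example.

```python
# Referred cases, by source of referral (counts from the text).
referred = {
    "University officials, in the spirit of pressure": 93,
    "college counselor or faculty member": 791,
    "high school counselor or community welfare agency": 100,
}

# Voluntary cases, by how the student learned of the Bureau.
voluntary = {
    "bulletins, class lectures, friends, or relatives": 892,
    "Registrar's office": 122,
    "high school or college teachers other than counselors": 55,
}


def percent_shares(counts, total):
    """Percentage of `total` for each category, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


n_referred = sum(referred.values())
n_voluntary = sum(voluntary.values())
n_total = n_referred + n_voluntary

for label, pct in percent_shares({**referred, **voluntary}, n_total).items():
    print(f"{pct:5.1f}%  {label}")
```

The two subtotals reproduce the 984 referred and 1069 voluntary cases
stated in the text.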

What distinguishes these students coming by way of various
campus and community agencies? Analyses of the variance
(1: Chap. V) in high school percentiles and college aptitude
test scores give a partial answer to the question. Referred
students tend to be lower than voluntary students in
both high school achievement and college aptitude (F values
of 24.08 and 76.59, respectively, both well beyond the one
per cent point). While there are significant variances between
referred and voluntary cases, "t" tests (1: 97) revealed
that the variation of the sub-categories in college aptitude
was homogeneous within each of the two main divisions.
This means that there were no reliable differences in ability,
either among students who had been referred by a faculty
counselor, a college official, or a high school counselor or
welfare officer, or among students who came voluntarily as a
result of contact with high school or college faculty, the Registrar's
office, or some informal source of information about
the Bureau's services.
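
For readers unfamiliar with the F ratio cited above, the sketch below
computes it for a one-way analysis of variance. It is a generic
textbook computation, not a reconstruction of the authors' analysis,
and the two groups of scores are hypothetical values invented for
illustration.

```python
def one_way_f(groups):
    """F ratio and degrees of freedom for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores about their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical aptitude percentiles (illustration only).
referred_scores = [40, 55, 35, 50, 45]
voluntary_scores = [60, 70, 65, 55, 75]

f_ratio, df_b, df_w = one_way_f([referred_scores, voluntary_scores])
print(f"F = {f_ratio:.2f} with {df_b} and {df_w} degrees of freedom")
```

The computed F is then compared with the tabled value at the chosen
level, such as the one per cent point mentioned in the text.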

In the case of high school achievement, differences do exist
between students referred by high school counselors or welfare
agencies and students referred by college officials, college
counselors, or faculty members. The relations found between
type of problems and origin of case tend to clarify the picture.
Students who have been referred by high school counselors
or welfare agencies are more likely to have financial
and health problems and are less likely to have vocational
and educational problems. Thus we may conclude that students
referred from these sources outside the University are
likely to be students with financial or health problems,
referred because they were good students in high school and
had well-developed vocational goals.

The relation of type of problem to origin of case also
gave indications that the other two types of referred students
tend to have fewer financial problems and that students who
came voluntarily were more likely to have vocational problems
and less likely to have health problems.

Types and Frequencies of Problems 

Before more adequate evaluations of counseling can be
made, the types and frequencies of problems encountered must
be described. Then only can the foundation be laid for more
precise experimentation in which treatment methods are differentiated
according to their relative value for various types
of problems or problem constellations.

The scope of this paper is limited to a general description
of counseling. Another paper is being prepared which will
describe how these problems cluster for the students counseled
and with what characteristics these clusters are associated.
This is conceived as the first step toward evolving a symptomatology
for use in counseling.

TABLE 1

TYPES OF STUDENT PROBLEMS AS RECORDED IN CASE HISTORIES

                                                        Frequency
                                                            of
                                                        Occurrence

A. Financial
   1. Need or desire for part-time work, scholarship or
      loans; inadequate finances                            443

B. Vocational
   1. Poor aptitude for chosen vocation                     270
   2. Inability to decide between two or more vocational
      choices                                               756
   3. Definite choice but wants confirmation or
      encouragement                                         976
   4. Definite choice but in doubt about aptitude           116
   5. Definite choice but based only on influence of
      family, friends, etc.                                 108
   6. Dearth of interest in any vocation                     42
   7. Information needed about occupations in general        66
   8. Vocational choice without adequate self-analysis       60
   9. Inadequate information in regard to professional
      choice                                                 50   2444

C. Educational
   1. Poor aptitude for college work                        292
   2. Selection of course in line with occupational choice  632
   3. Inferiority in academic skills such as reading, study
      habits, English usage, etc.                           446
   4. Understanding grading standards                        15
   5. High general aptitude and poor scholastic
      achievements                                          184
   6. Understanding responsibilities in college             134
   7. High aptitude hampered by standard curricula           20
   8. Outside work interfering with studies                  71
   9. University entrance without proper requirements        48   1842

D. Social, Personal, and Emotional
   1. Too much social life or too many social activities     43
   2. Inadequate participation in extra-curricular
      activities                                             23
   3. Selecting student activities in line with interests     8
   4. Social personality traits which may hinder
      professional success                                  115
   5. Need for encouragement and self-confidence            230
   6. Social timidity                                        89
   7. Emotional disturbances                                172
   8. Family domination in vocational choice                 78
   9. Conflict with family or friends                        91
  10. Parental anxiety for a wise vocational choice          52
  11. Fear of intellectual inadequacy                        34
  12. Idealization of a profession                           17
  13. Over-evaluation of a college degree                    17    969

E. Health and Physical Disabilities
   1. Serious physical disabilities                         107
   2. Easily fatigued                                        27
   3. Inability to do justice to work because of
      intermittent illness                                   35
   4. Physical habits, diet and sleep, etc.                   9    178

                                                                  5876


Table 1 shows the distribution of the 5876 problems found
in the case records of these 2053 cases. We see that about
two-thirds of the problems of the students were of an educational
or vocational nature. The most frequent vocational
problems were found among cases of students unable to decide
between two or more vocational choices or who wanted confirmation
or encouragement in making a vocational choice.
The most typical educational problems were those of selecting
a training program appropriate to the vocational choice and
those due to inferiority in such academic skills as reading,
study habits, and English usage.
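
The category totals of Table 1 can be summed and expressed as shares of
all recorded problems. The sketch below uses the five category totals
as read from the table; the variable names are assumptions of this
example.

```python
# Category totals as read from Table 1.
problem_totals = {
    "Financial": 443,
    "Vocational": 2444,
    "Educational": 1842,
    "Social, Personal, and Emotional": 969,
    "Health and Physical Disabilities": 178,
}

grand_total = sum(problem_totals.values())

# Share of all recorded problems falling in each category.
shares = {name: count / grand_total for name, count in problem_totals.items()}

# Combined count for the vocational and educational categories.
voc_ed = problem_totals["Vocational"] + problem_totals["Educational"]

for name, share in shares.items():
    print(f"{name:34s} {share:6.1%}")
print(f"Vocational + Educational: {voc_ed} of {grand_total}")
```

On these figures the vocational and educational categories together
account for 4286 of the 5876 problems.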

Social-personal-emotional problems were the next most
frequent types of problems. Modal sub-types were less marked
here. The three most frequent types of social-personal-emotional
problems were need for encouragement and self-confidence,
social personality traits which may hinder professional
success, and emotional disturbances, for the most part of a
non-psychiatric nature.

Health problems were infrequently found among these
cases, 178 problems being discovered and reported to the
Bureau by the University Student Health Service.

The Informational Basis of Counseling

The progress of counseling as a discipline has been characterized
by a departure from "gold brick" methods of judging
abilities and character. The trend is toward a greater
reliance upon the systematic collection of data about the individual
by means of standardized tests, reports from other
people who have had contacts with him under diverse conditions,
medical records, and so on. The only vestige of early
counseling methods is the interview. This is still the most
vital part of the counseling process, but is now the melting pot
in which the student and the counselor integrate information
to draw out a unified picture of the individual and to plan the
next steps in adjustment.

In Table 2 we present a summary of the number and frequencies
with which each source of data was consulted by the
counselor. If frequency and source of data are grouped together,
27,866 units of data were used as the basis for counseling.
The most frequent source of information was vocational
and educational tests given in the Testing Bureau.
Clearance slips from the Faculty-Student Contact Desk (6: 83)
provided information in 2038 cases.

Other important sources of data were Health Service
reports, University Entrance Test rating, and grades from
the Registrar's office. The fact that reports from family or
other relatives were the least frequent sources of information
could be taken as an indication of the need for a study to
determine whether a social worker would be a valuable addition
to the Testing Bureau staff. It would be necessary to
determine how much information would be added and how
useful that information would be in diagnosis and treatment.

TABLE 2

NUMBER AND SOURCE OF DATA

[Body of table illegible in the source; only the column totals can be read.]

Total of Each Type of Data Collected

Reports from Health Service                              1,253
University Entrance Test rating                          5,612
Grades from Registrar's office                           1,642
Vocational and educational tests given in Bureau        14,574
Clearance slip from Faculty-Student Contact Desk         4,083
Special test results                                       133
Application blanks, admission                               11
Reports from college deans                                 365
Reports from family or other relations                       9
Reports from high school counselor                          10
Reports from others                                        174

Total                                                   27,866

Counseling Procedures Classified by Type of Problem

One of the areas for research in counseling which has
been least exploited is the precise description of the counseling
interview: the relationships between various counseling
processes and the effectiveness with which each type of problem
is handled. The ultimate objective of such description is
the delineation of these counseling processes in terms of
fundamental psychological categories. One possible preliminary
step may be the general description of what the counselor
did with regard to various types of problems. This is
called a general description because it does not attempt to
take into account the psychological setting within the interview
(e.g., the specific attitudes the counselor and counselee
had toward each other at that point) when the behavior
described occurred.

In Table 3 we present a general description of the
Bureau's counseling procedures. The data show that the commonest
procedures in counseling students with financial problems
took the form of discussing the need for work, discussing
scholarships and loan funds as a source of money, and discussing
the relation of part-time work to the student's class
schedule.

In the treatment of vocational problems, the counselors
relied mainly on discussions of aptitude and on advice and
recommendations of occupational choice on the basis of test
results. Other frequent procedures include advising vocational
"tryouts" through college courses, descriptions of occupations
and advising a general background training before a
definite choice is made.

With educational problems, the most frequent procedure
was aid in selecting a schedule of classes in line with aptitude.
In another large number of cases the counselors discussed
course prerequisites, sequence of courses, and the like.
Attempts at cultivation of interests in studies and scholastic

TABLE 3

COUNSELING PROCEDURES CLASSIFIED BY TYPE OF PROBLEM

                                                        Frequency
                                                            of
Types of Problems                                       Occurrence

Financial Problems
   1. Discussion of relation of part-time work to class
      schedule                                               71
   2. Discussion of need for work                            83
   3. Suggestions of ways of getting jobs                    12
   4. Discussion of student's expenses and financial
      resources                                              54
   5. Discussion of scholarship and loan funds               78
   6. Letters of recommendation for jobs, scholarships and
      loans                                                  40
   7. Referral to employment bureau                          22
   8. Referral to financial aid agencies                     31    391

Vocational Problems
   1. Description of occupations                            232
   2. Referral to informational books                       128
   3. Discussion of aptitude                               1120
   4. Discussion of student's financial resources for
      occupational training                                 125
   5. Vocational tryouts through college courses            351
   6. Advice and recommendation of occupational choice (on
      basis of test results)                               1346
   7. Advice of general background training before definite
      choice is made                                        251
   8. Discussion of method of entering and securing
      employment in chosen occupation                       120   3673

Educational Problems
   1. Use of class schedule for program making                4
   2. Discussion of course prerequisites, sequence of
      courses, etc.                                         304
   3. Cultivation of interests in studies, scholastic
      record, etc.                                          121
   4. Explanation of recitation method of study              13
   5. Discussion of special surroundings conducive to
      effective study                                        31
   6. Discussion of methods of vocabulary building           27
   7. Tutorial aid with specific subjects                    20
   8. Aid in selecting a schedule of classes in line with
      aptitude                                              642
   9. Aid in budgeting hours for study                       49
  10. An attempt to analyze cause for difficulty with a
      specific subject                                       38
  11. Explanation of student's low aptitude as cause of low
      scholarship                                            34
  12. Recommendation of non-college type of vocational
      training                                              149
  13. Attempt to diagnose and overcome special disability in
      spelling, grammar, mathematics, etc.                   25
  14. Discussion and explanation of student's responsibility
      in college, grading standards, etc.                    74
  15. Recommend student change course of study              182
  16. Discussion of eligibility for prescribed work          97
  17. Referral to "How to Study" instructors                 62   1872

Social, Personal and Emotional Problems
   1. Warning of over-emphasis put upon his social
      activities                                             13
   2. Arranging contact with proper activities such as band
      and debate, etc.                                        1
   3. Suggestion of proper activities to try out for
      extra-curricular interests                             13
   4. Establishing friendly contact with faculty for future
      use                                                    22
   5. General discussion, encouragement and assistance with
      problem of self-confidence                            155
   6. Suggested treatment for specific personality
      difficulties                                           28
   7. Discussion of meeting people and making friends        20
   8. Treatment suggesting special things to do              11
   9. Suggesting discussion between family and friends over
      mutual conflicts                                       25
  10. Letter or interview with members of family over
      conflicts                                              28
  11. Discussion of worries and other emotional problems    108
  12. Advise transfer to another school because of home
      environment                                            12
  13. Recommend welfare agency to assist student              2
  14. Referral to Y.M.C.A. for aid in social adjustment
      of student                                             19
  15. Referral to psychiatrist for diagnosis                 53
  16. Referral to Speech Clinic for diagnosis and treatment
      of speech difficulties                                 31    541

Health and Physical Disabilities
   1. Discussion of handicaps                               105
   2. Advising remedial gym                                   8
   3. Discussion of living arrangements, sleep, diet, etc.    4
   4. Athletics suggested for better health                   2
   5. Referral to child health clinic                         1
   6. Referral to Health Service for special health
      examination                                             2    122





records, recommendation of a non-college type of vocational
training, and recommendation that the student change his
course of study were other procedures with high frequency.

The tabulation of the procedures used with social-personal-emotional
problems indicates that the counselor relied
on either a general discussion for encouragement and assistance
with the problem of self-confidence or discussed worries
and other emotional problems. The next most frequent step
was to refer the student to the psychiatrist for diagnosis.

Discussion of the physical handicaps involved was by far
the main method used with health problems. The counselor's
discussion did not impinge upon medical advice but rather
upon the relationship between physical condition and educational
and vocational adjustments.

Summary

A general description based on 2053 cases was presented
as a basis for analytical description of counseling in the
Testing Bureau of the University of Minnesota over a four
year period from 1932 to 1935 inclusive.

This description enables us to see how well the Bureau's
counseling service is coordinated with the general personnel
program of the University. By broad delineations of problem
areas handled, of sources and amounts of data used and of
procedures followed, the authors hope to break ground for
a much needed basic description of the psychological processes
involved in counseling interviewing conducted in a
non-psychiatric guidance clinic for college students.

What is needed is like descriptions by other counseling
services which may be based on the same or other philosophies
of counseling. Such an accumulation of data should
lead to even more specific descriptions in which recorded
interviews would probably provide the raw data. Ultimately,
it should be possible to determine experimentally which
counseling procedures are most effective with what types of
problems. This is the objective of evaluative research in the
field of counseling.





EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT


REFERENCES

1. Lindquist, E. F. Statistical Analysis in Educational Research. New
   York: Houghton Mifflin Company, 1940. 266 pp.

2. Problems of College Education. Edited by Earl Hudelson.
   Minneapolis: University of Minnesota Press, 1928. 449 pp.

3. Williamson, E. G. "Faculty Counseling at Minnesota," Occupations,
   XIV (1936), 426-433.

4. Williamson, E. G. and Bordin, E. S. "The Evaluation of Vocational
   and Educational Counseling: A Critique of the Methodology
   of Experiments," Educational and Psychological Measurement, I
   (1941), 5-24.

5. Williamson, E. G. and Bordin, E. S. "A Statistical Evaluation of
   Clinical Counseling," Educational and Psychological Measurement,
   I (1941), 117-132.

6. Williamson, E. G. and Bordin, E. S. "Evaluating Counseling By
   Means of a Control Group Experiment," School and Society, LII
   (1940), 434-40.

7. Williamson, E. G. and Darley, J. G. Student Personnel Work.
   New York: McGraw-Hill Book Company, Inc., 1937. 313 pp.

8. Williamson, E. G., Longstaff, H. P., and Edmunds, J. M.
   "Counseling Arts College Students," Journal of Applied Psychology,
   XIX (1935), 111-124.

9. Williamson, E. G. and Sarbin, T. R. Student Personnel Work in
   the University of Minnesota. Minneapolis: Burgess Publishing
   Company, 1940. 115 pp.





A COMPOSITION TEST FOR FOREIGN 

LANGUAGES 


LAWRENCE ANDRUS 
University of Chicago

THIS PAPER discusses a type of French composition
test, developed at the University of Chicago, which has
in practice yielded remarkably good results as a measuring
instrument and has proved very stable, i.e., has given
comparable scores from year to year.

The test was developed as a part of the comprehensive
examination in French 104-105-106, the sequence in
Intermediate French given in the College of the University of
Chicago. The College, as the term is used at the University of
Chicago, includes the years corresponding to the freshman
and sophomore years of the traditional four-year program.
The prerequisites for admission to French 104-105-106 are
two units of high school French or the successful completion
of French 101-102-103, the sequence in Elementary
French. Students in the College, as contrasted with more
advanced students who desire to offer French 104-105-106
as an elective in a field related to their major field, may gain
credit for the sequence only by passing the comprehensive
examination given at the end of the Spring Quarter. The
great majority of students in the course are College students.
Since these students pass or fail solely on the basis of the
comprehensive examination, the staff of the course and the
examiner attempt, in every possible way, to make the
examination as valid, as reliable, and as discriminating as they can.
In the attempt to secure greater reliability, objective
questions, or questions which can be scored with high objectivity,
have been devised.

The Announcement of the College for 1940-41 describes
French 104-105-106 thus: "The primary objective of the
second-year sequence is the standardization of the language
abilities. To that end there is continuous training in formal and
informal written and oral expression, aural comprehension
and the accurate determination of the value of the printed
word. Approximately twenty-five hundred pages are read,
with reports, following individual programs." This statement
is a fair description of the course as given in the preceding four
years, the period covered by this investigation. The type of
test here discussed was intended to measure the outcomes of
training in written expression. It was first used experimentally
in the comprehensive examination of June, 1934.¹ In
substantially its present form, it was included as a part of the 1933
examination, and retained in the following years with minor
changes in the physical presentation.

The essential features of this type of test are as follows:
a French passage is chosen which, in the judgment of the staff
and the examiner, contains material suitable for testing at the
level of the course, from the point of view of both vocabulary
and syntax. It should be emphasized that the choice of
an appropriate passage is extremely important, if the test is
to yield maximal results. It may be necessary to read many
pages before a suitable passage is located. This passage is
then translated into good English. The next step is to go
through the French text and delete certain words and phrases.
The corresponding parts of the English translation are
underlined and numbered to agree with the numbers replacing the
omitted words and phrases in the French passage. The
student is required to complete the French passage in accordance
with the English translation. He is guided in this task by
the numbers and the underlining.
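The construction steps just described can be sketched in code. The sketch below is only an illustration under stated assumptions: the function name, the `(n) ____` blank marker, the underscores standing in for underlining, and the sample sentence are all hypothetical, and the chosen phrases are assumed to occur in the same order in both languages, which the article's own sample shows is not always the case.

```python
from dataclasses import dataclass


@dataclass
class Item:
    number: int
    french: str   # word or phrase deleted from the French passage
    english: str  # corresponding part of the English translation


def build_test(french, english, pairs):
    """Blank each chosen French phrase and number its English counterpart.

    pairs: list of (french_phrase, english_phrase), assumed to appear
    in passage order in both texts (a simplifying assumption).
    """
    items = []
    f_pos = e_pos = 0
    for number, (fr, en) in enumerate(pairs, start=1):
        f_at = french.index(fr, f_pos)
        e_at = english.index(en, e_pos)
        blank = f"({number}) ____"
        tagged = f"({number}) _{en}_"  # underscores stand in for underlining
        french = french[:f_at] + blank + french[f_at + len(fr):]
        english = english[:e_at] + tagged + english[e_at + len(en):]
        f_pos = f_at + len(blank)
        e_pos = e_at + len(tagged)
        items.append(Item(number, fr, en))
    return french, english, items


# Hypothetical one-item example, not the actual 1939 test.
fr_text = "Le vieux marquis se leva et vint s'appuyer à la cheminée."
en_text = "The old marquis arose and came to lean against the mantelpiece."
blanked, numbered, key_items = build_test(
    fr_text, en_text, [("s'appuyer à", "to lean against")]
)
```

The deleted French phrases collected in `key_items` form the starting point of a scoring key; the choice of which phrases to delete remains a matter of staff judgment, as the text stresses.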

A sample taken from the June, 1939, examination will give
a better idea of the physical arrangement of the test than a
lengthy explanation.

¹ See Ernest Haden and John M. Stalnaker, "A New Type of
Comprehensive Foreign Language Test," The Modern Language Journal,
XIX, 2 (November, 1934), 81-92.




Translation of French Test on Opposite Page

The old marquis de la Tour-Samuel, (1) eighty-two years old, arose and
came (2) to lean against the mantelpiece. He said (3) in his (4) somewhat
trembling voice:

"(5) I, too, know a strange thing, so strange (6) that it has been the
obsession of my life. (7) It is now fifty-six years (8) since this adventure
(9) happened to me, and (10) a month doesn't go by (11) without my seeing
it again in a dream. (12) There has remained to me from that day a mark,
an imprint of fear, do you understand me? Yes, (13) I underwent horrible
fright, (14) for ten minutes, (15) in such a way that since that hour (16) a
kind of constant terror (17) has remained (18) in my soul. (19) Unexpected
noises (20) make me start, (21) objects (22) that I make out (23) poorly in
(24) the evening shadow give me (25) a mad desire (26) to run away.
Finally, I'm afraid (27) at night.

"Oh! (28) I shouldn't have admitted (29) that (30) before having arrived
at my (31) present age. Now I can say (32) everything. It is permitted
(33) not to be brave before imaginary dangers, when (34) you are eighty-two.
Before real dangers, (35) I have never retreated, ladies."





Certain words and phrases have been omitted from the following French
passage, and a number has been substituted for each omitted word or phrase.
In each numbered space at the right, write in FRENCH the appropriate word
or phrase. Be sure that your translation fits the French context. An English
translation of the passage is given on the page opposite; the translation of each
omitted word or phrase is underlined and preceded by a number which
corresponds to the number in the French passage for ease in reference. Note that
there is not always exact correspondence in form between the French and the
English. The old marquis uses the conversational, that is, informal, style, until
he begins to tell his story, which is in literary style.





A casual inspection of the sample suffices to reveal that this
test form makes possible the use of a great variety of items,
both as to content and as to length. The items may all be
classified under one heading, usage, with subclassification
under active vocabulary (including idioms) and grammar
(including syntax). By the proper choice of items tested, it is
possible to vary at will both the level of difficulty and the
proportion of vocabulary items and grammar items. In this way,
validity with reference to specific objectives and content of a
given course of study may be built into the test. For instance,
in French 104-105-106 at the University of Chicago, a
common practice has been to restrict items used in this test to the
2,500 words of highest frequency in the Vander Beke French
Word Book.² A similar procedure may be followed with
respect to idioms, by using the Cheydleur French Idiom List.³
Note that both upper and lower limits may be adopted. At
present, grammar items must be validated on the basis of
textbooks used and the subjective judgment of the instructing
staff. When the French Syntax Count, begun under the
direction of the late Professor Coleman, and now proceeding under
the direction of Professor Keniston, is finally available, there
will be an objective criterion of difficulty for French
syntactical constructions. In Spanish, this invaluable aid has already
been published.⁴

Theoretically, the most discriminating type of item for
use in an achievement examination is one answered correctly
by 50 per cent of the group taking the examination.⁵ In
practice, we almost never find a test containing even a majority
of items of this type, except in the case of standardized tests
which have been refined by statistical procedures, and even then,
² George E. Vander Beke, French Word Book (New York: Macmillan,
1929).

³ Frederic D. Cheydleur, French Idiom List, Based on a Running Count of
1,183,000 Words (New York: Macmillan, 1929).

⁴ Hayward Keniston, Spanish Syntax List (New York: Henry Holt & Co.,
1937).

⁵ Thelma Gwinn Thurstone, "The Difficulty of a Test and its Diagnostic
Value," The Journal of Educational Psychology, XXIII, 5 (May, 1932), 335-343.



perhaps only with reference to the group on which the test was
standardized. The classroom teacher interested in using the
kind of test herein discussed would obviously have neither the
time nor the statistical knowledge to go through the various
steps required to develop a test composed largely of the most
discriminating items. A fair approximation can, however, be
attained by remembering that items that will be passed by
practically all the group of students, or by almost none, have very
little value for discrimination. They might be called "dead
wood." The experienced teacher and his colleagues can, by
subjective judgment, identify many such unprofitable items.
Repeated use of a given test form and inspection of the
results (not necessarily involving a formal item analysis,
although that is always desirable when practicable), will tend
to bring the teacher's subjective judgment of the worth of an
item closer to an objective evaluation. It goes without saying
that the value of an item as regards discrimination varies
with the level of instruction and the content and method of
the course, and should always be estimated in terms of these
latter. To take a hypothetical example, in one school an
item involving a particular use of the definite article in French
might be highly discriminating, whereas in a school in a
neighboring town, using a different course of study and a different
method, the students might have received so much drill on
this particular point that an item involving it would be passed
by practically every student, and, hence, be of very little value
for discrimination.
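The informal inspection of results described above can be approximated with a few lines of code. This is a minimal sketch, not a formal item analysis: the cut-offs of 10 and 90 per cent for "almost none" and "practically all" are assumptions chosen for illustration, and the score matrix is invented.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    responses: one list of 0/1 item scores per student.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]


def dead_wood(difficulties, low=0.10, high=0.90):
    """1-based numbers of items passed by almost none or almost all.

    The thresholds are illustrative assumptions, not values from the article.
    """
    return [number for number, p in enumerate(difficulties, start=1)
            if p <= low or p >= high]


# Hypothetical scores: 5 students by 4 items (1 = correct).
scores = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
p_values = item_difficulty(scores)
flagged = dead_wood(p_values)
```

In this invented example items 1 and 3 are flagged, since every student passed the first and none passed the third; items near a proportion of .50 are the ones the text regards as most discriminating.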

If the passage chosen, although otherwise desirable, is
judged to lack an adequate number of instances of a particular
construction considered important by the instructing staff, it is
frequently possible to add items involving this construction
by making slight changes in the French passage. It may not
even be necessary to change the English translation at all,
after such revision has taken place. The vocabulary items
can be controlled in like manner.

The scoring of this test can be made very objective. At
the time the test is constructed, as complete a key as possible



is prepared, preferably by the entire staff of the course. This
key facilitates the work of the scorer, who must, however,
know the language thoroughly. The scoring cannot be
entrusted to clerks. Whenever the scorer meets a correct
answer not included in the key, he adds it to the key. Even
with the necessity for consideration of such answers, the
scoring is very rapid. In syntactical items, minor errors in
spelling and mistakes in accents are disregarded, provided
that the student uses correctly the construction on which the
item hinges. The scoring thus becomes nearly as objective
as that of a multiple-choice test.
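The key-based scoring rule can be sketched as follows. The sketch is an assumption-laden illustration: it handles only the disregarding of accent mistakes on syntactical items (minor spelling errors would still need the scorer's judgment), and the key entries, item numbers, and answers are hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics so that accent mistakes are disregarded."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")


def normalize(text, ignore_accents):
    """Lower-case, collapse spaces, and optionally drop accents."""
    text = " ".join(text.lower().split())
    return strip_accents(text) if ignore_accents else text


def score(answers, key, syntactical):
    """Count answers matching any accepted key entry for their item.

    answers: {item number: student's answer}
    key: {item number: set of accepted answers}
    syntactical: item numbers on which accent mistakes are disregarded
    """
    total = 0
    for number, answer in answers.items():
        lenient = number in syntactical
        accepted = {normalize(entry, lenient) for entry in key[number]}
        if normalize(answer, lenient) in accepted:
            total += 1
    return total


# Hypothetical key for two items; item 6 is treated as syntactical.
key = {6: {"qu'elle a été"}, 27: {"la nuit"}}
syntactical = {6}
student_a = {6: "qu'elle a ete", 27: "la nuit"}  # accents dropped
student_b = {6: "qu'il a été", 27: "la nuit"}    # wrong construction
```

When the scorer meets a correct answer that the key lacks, it can be added with `key[number].add(...)`, which mirrors the practice of extending the key during scoring.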


TABLE 1

COMPOSITION TEST: FRENCH 104-105-106

                                              1937     1938     1939     1940
No. of items                                   112      100      100      100
No. of points                                  112      100      100      100
No. of points in comprehensive examination     545      495      485      485
Mean                                         52.09    51.75    45.96    54.52
Standard deviation                           17.80    15.95    14.86    13.98
Reliability*                                   .94      .94      .92      .91
Standard error of measurement
  ($\sigma_{meas} = \sigma\sqrt{1 - r}$)      4.36     3.91     4.20     4.19
Correlation with entire comprehensive
  examination                                  .92      .91      .88      .90
No. of cases                                    44       60       78       52

*Estimated by Kuder-Richardson formula No. 20.⁶



Table 1 shows the main results of a statistical analysis of
the different forms of this test used in the comprehensive
examinations in French 104-105-106 at the University of
Chicago during the four-year period 1937-1940 inclusive.

We note that the mean score in all four years was in the
general neighborhood of 50 per cent of the possible number
of points in the test. This is equivalent to saying that the
average item was answered correctly by about 50 per cent of
the group taking the examination. In 1940, a few items were

⁶ See G. F. Kuder and M. W. Richardson, "The Theory of the Estimation
of Test Reliability," Psychometrika, II, 3 (September, 1937), 151-60.



purposely included which seemed a priori rather easy for the
group tested, in order to test the effectiveness of such a priori
judgment. These items were answered correctly by most of
the students, and are reflected in the higher mean score for
the 1940 test.
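The statement that the mean lay near 50 per cent of the possible points can be checked directly from Table 1. The short sketch below simply divides each mean by the number of points; the variable and function names are illustrative only, and since each item carries one point the result can also be read as the proportion of students passing the average item.

```python
# Values transcribed from Table 1: year -> (mean score, number of points).
table_1 = {
    1937: (52.09, 112),
    1938: (51.75, 100),
    1939: (45.96, 100),
    1940: (54.52, 100),
}


def percent_of_possible(mean, points):
    """Mean score expressed as a percentage of the points obtainable."""
    return 100 * mean / points


percentages = {year: percent_of_possible(mean, points)
               for year, (mean, points) in table_1.items()}

for year, value in percentages.items():
    print(f"{year}: {value:.1f} per cent of possible points")
```

The four values fall between roughly 46 and 55 per cent, with the 1940 form highest, which agrees with the remark about the deliberately easy items included that year.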

In each of the four forms of the test, the standard
deviation was large enough so that the students' scores were well
spread out, thereby facilitating classification of the students
in rank order of merit. The differences in size of the
standard deviation from one year to another are no greater than
differences often found in administering the same test to two
different groups of students, one slightly more homogeneous
than the other. The spread of students' scores on this test
is thus quite comparable from year to year.

For a test of 100 items, a reliability of .90 is commonly
considered good. The lowest reliability coefficient estimated
for the four-year period was .91 for the 1940 form; the
highest, .94 for the 1938 form (this is relatively better than
.94 for the 1937 form, since the latter contained twelve more
items).
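Two calculations sit behind this paragraph, and both can be sketched briefly. The first function follows the usual statement of Kuder-Richardson formula No. 20 for items scored 0 or 1. The second uses the Spearman-Brown formula, which the article does not name; it is brought in here only as an assumed way of illustrating why .94 on 100 items is relatively better than .94 on 112 items. The small score matrix is invented.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: one list of 0/1 item scores per student.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(student) for student in responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n  # population variance
    sum_pq = 0.0
    for i in range(k):
        p = sum(student[i] for student in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / variance)


def spearman_brown(reliability, length_ratio):
    """Reliability expected if test length is multiplied by length_ratio."""
    return (length_ratio * reliability
            / (1 + (length_ratio - 1) * reliability))


# Hypothetical data: 4 students by 3 items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
example_kr20 = kr20(example)

# The 1937 form (112 items, r = .94) projected to a 100-item length.
r_1937_at_100_items = spearman_brown(0.94, 100 / 112)
```

Under this assumption the 1937 form shortened to 100 items would be expected to show a reliability of about .93, slightly below the .94 obtained with the 100-item form of 1938.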

In none of the four years is the standard error of
measurement as large as one-tenth of the mean score, and in none
is it as large as one-third of the standard deviation. These
values are satisfactorily low. They indicate that chance
error in measurement has been kept within reasonable limits.
Note that the difference between the highest and lowest
standard errors of measurement here reported is only .45. To
illustrate the meaning of this slight difference between the
two extremes, let us assume that a student in 1937 and a
student in 1938 each have a score of 40.00. The chances
are two out of three that the true score of the 1937 student
lies between 35.64 and 44.36, and that the true score of the
1938 student lies between 36.09 and 43.91.*
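The figures in this paragraph follow from the formula given in Table 1, $\sigma_{meas} = \sigma\sqrt{1 - r}$. The sketch below recomputes the standard errors from the tabled standard deviations and reliabilities and then forms the band of one standard error on either side of an obtained score of 40.00; the function names are illustrative, and the "two out of three" reading of the band is the article's own interpretation.

```python
import math

# Values transcribed from Table 1: year -> (standard deviation, reliability).
table_1 = {
    1937: (17.80, 0.94),
    1938: (15.95, 0.94),
    1939: (14.86, 0.92),
    1940: (13.98, 0.91),
}


def standard_error_of_measurement(sd, reliability):
    """Standard deviation times the square root of (1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


def score_band(score, sem):
    """Obtained score minus and plus one standard error of measurement."""
    return (round(score - sem, 2), round(score + sem, 2))


sem = {year: round(standard_error_of_measurement(sd, r), 2)
       for year, (sd, r) in table_1.items()}

band_1937 = score_band(40.00, sem[1937])
band_1938 = score_band(40.00, sem[1938])
```

The recomputed values agree with the standard errors printed in Table 1 and with the two score bands quoted in the text.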

In only one year, 1939, does the correlation of students'
scores on the composition test with their scores on the entire

* Editor's Note: It will be recognized that not all statisticians would agree
as to the validity of this interpretation.



comprehensive examination fall below .90. The 1939
distribution includes one case (that of a student not registered
for the course) which is so unlike the other cases in the group
that it lowers the correlation by at least .01. It is customary
to regard the correlation of a sub-test with the entire
examination as spuriously high, since the sub-test is being
correlated with a test of which it forms a part. In each of the
four years, however, the composition test represents only
about one-fifth of the total number of points in the entire
comprehensive examination, and yet the correlation of this
part with the entire comprehensive examination remains
uniformly high, although a good part of the remaining material
in the comprehensive examination has changed in character
from year to year. This phenomenon, taken in connection
with the relatively constant mean, standard deviation,
reliability, and standard error of measurement, leads to the
conclusion that under the conditions prevailing in French
104-105-106 at the University of Chicago this composition test
represents a particularly stable type of measuring instrument.

Everyone must agree that the best method of testing
French composition would be to require the student to write
a free composition in French, if such tests could be scored
reliably. Unfortunately, reliable scoring is in practice very
hard to obtain. In the few instances where moderate success
has been achieved, the process requires a great deal of time
and involves essentially using the services of a jury of experts.
Most teachers would probably agree that, at least at the
lower and intermediate levels, the two elements which would
assume the greatest importance in their judgment of a free
composition in French are active vocabulary (including
idioms) and grammar (including syntax). This type of test is
capable of measuring students' achievement in these two
elements reliably and objectively. The writer feels that at the
lower and intermediate levels it is wiser to use a test which
can do this than to run the risk of unreliable measurement
which use of free composition entails.

The results of the statistical analysis shown in Table 1



can be accepted unquestionably only for French 104-105-106
at the University of Chicago. There is no guarantee, but
there is a strong presumption, that a test of this kind,
constructed elsewhere with equal care, and with due attention to
the objectives, content and method of the course of study,
would yield equally favorable results. The technique could
certainly be applied to Spanish, Italian, and Portuguese as
well as to French; the sentence structure of German might
prevent the technique from being as effective in that language
as in the Romance languages.





PERFORMANCE TESTING IN PUBLIC 
PERSONNEL SELECTION 

PART II¹

SIDNEY W. KORAN

Employment Board, Pennsylvania Department of Public Assistance

The Test for Graphotype-Addressograph Operators

THE POSITION of graphotype-addressograph operator
occurs in each of the four regional financial offices of
the Department of Public Assistance. Since these offices were
conveniently located about the State and were equipped with
a sufficient number of graphotype and addressograph machines
to insure rapid and efficient conduct of the performance test,
each of the four was used as an examination center and the
examinees were permitted to appear at the one of their choice.

To minimize difficulties likely to arise because some
examinees might be unfamiliar with the particular models on which
the test was to be given, the notification form sent to each
examinee included the statement: "The examination to which
you have been assigned has been designed to test your ability
to operate the Class 6300 Graphotype and the Class 2700
Addressograph."

The test consisted of the following two parts and was set
up and scored so that much the greater emphasis was placed
on Part I:

I. Embossing names and addresses on Addressograph
plates with the Class 6300 Graphotype
II. Printing cards from the embossed plates with the
Class 2700 Addressograph

Standardization of the procedure was achieved by (1)
having all tests administered under the supervision of trained

¹ Part I of this article appeared in the July issue of this Journal.



individuals, (2) requiring the examiners to repeat all
directions to the examinee verbatim from the copy, (3)
providing each examinee with a set of Instructions setting forth
the nature of the examination he was about to take and
exactly what was expected of him, (4) using a stop watch for
all timing, and (5) careful mechanical inspection of every
machine at the beginning of each period of testing.

About 10 minutes before he was assigned to a machine
each candidate was given a copy of the Instructions to
Examinees (see Exhibit G) and told to read them carefully and
to refer to them as often as he wished. When his turn
arrived he was assigned to a Graphotype, furnished with a file
drawer containing 22 plates and 20 frames, and given five
minutes to familiarize himself with the machine and to
practice embossing two of the plates. At the expiration of the
practice period the examiner collected the two practice plates,
furnished the examinee with the list of names and addresses
to be embossed, and read aloud the appropriate sections of
the Instructions to Examiner (see Exhibit H). At the
expiration of 10 minutes the examinee was directed to stop
embossing and to place the completed plates into frames. He was
then assigned to an Addressograph and required to print a
card from each embossed plate. If the examinee was unable
to operate the Addressograph sufficiently well to print a
legible copy from each plate, the examiner had an assistant print
the plates on a strip of paper so that a record of the
candidate's performance on the Graphotype would be available for
scoring.

As with the Telephone Operator test (described in Part
I of this article), the scoring procedure was designed to
eliminate those whose performance fell below the standard
established as the minimum acceptable, and to produce quantitative
ratings reflecting the relative operating ability of those who
survived that elimination. In establishing the qualifying
point, consideration was given to (1) the requirements of
the job, (2) the caliber of the individuals employed and
available for employment, and (3) data on the agency's previous



experiences with a performance test for this type of position.

The procedure followed in scoring the plates of those
who met this criterion took into consideration both the speed
and the accuracy with which the plates had been embossed.
The examinee's raw score on the Graphotype portion of the
test was determined by subtracting his error score (computed
by counting each deviation from the key² as one error) from
the total number of strokes completed. Examinees who
demonstrated ability to operate the Addressograph were given
additional credit up to 10 per cent of the total allowed for
the complete performance test. Keys were constructed which
reduced the scoring task to a routine operation.
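The raw-score rule for the Graphotype portion (strokes completed minus errors) can be sketched as follows. Several points are assumptions made only for illustration: deviations are counted by a character-by-character comparison using Python's `difflib`, every embossed character including a space is taken as one stroke, and the sample name is invented. The article does not say how the raw score was converted to a final rating, so that step is left out.

```python
from difflib import SequenceMatcher


def count_deviations(key_text, embossed_text):
    """Count incorrect, inserted, and omitted characters against the key.

    Each differing character counts as one error (an assumed reading of
    "each deviation from the key").
    """
    matcher = SequenceMatcher(None, key_text, embossed_text, autojunk=False)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            errors += max(i2 - i1, j2 - j1)
        elif tag == "delete":      # omission of a letter or number
            errors += i2 - i1
        elif tag == "insert":      # insertion of a letter or number
            errors += j2 - j1
    return errors


def graphotype_raw_score(key_text, embossed_text):
    """Total strokes completed minus the error score."""
    strokes_completed = len(embossed_text)  # assumption: one stroke per character
    return strokes_completed - count_deviations(key_text, embossed_text)


# Hypothetical plate line: one incorrect letter out of ten strokes.
raw = graphotype_raw_score("JOHN SMITH", "JOHN SMYTH")
```

A prepared key of this kind is what reduces scoring to a routine operation: the scorer, or here the comparison function, only tallies departures from it.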

The Test for Tabulating Machine Operators

The position of tabulating machine operator (IBM
equipment) occurs only in the operating agency's State office. In
administering the test in cities other than Harrisburg it was
therefore necessary to arrange to use the facilities of the
IBM Service Bureau.

To discourage individuals whose entire practical
experience had been confined to the operation of sorters, numeric
accounting machines, or Powers equipment from reporting
for the performance test, the following statement, intended as
a reminder, was included in the notification form sent to all
examinees: "As previously indicated, the examination for this
position has been designed to test your ability to operate both
the IBM horizontal counting sorter and the IBM alphabetic
accounting machine."

The test consisted of a two-part exercise. In Part I the
examinee was required to (1) wire the plugboard of the
alphabetic accounting machine for listing and for totals, (2)
use the horizontal sorter, and (3) adjust and operate the
alphabetic accounting machine so that certain alphabetic and
numeric data from previously punched cards would be listed
the same way as the data shown on the specimen form
provided with the Instructions to Examinees (Exhibit I). Part
II of the exercise was an extension of Part I in that it required
the examinee to list specific data from the tabulating cards
after (1) wiring the same plugboard to provide for numeric
control and subtotals, (2) re-sorting the punched cards, and
(3) making several additional adjustments on the accounting
machine.

² Examples of deviations penalized were (1) incorrect letter or number,
(2) spacing error of any kind, including line space, (3) insertion of a letter
or number, and (4) omission of a letter or number.

About 10 minutes before the candidate was required to
start the actual test, he was provided with a copy of the
Instructions to Examinees and told to read them carefully and
to refer to them as often as necessary throughout the
examination. Attached to the sheet of Instructions were a sample
punched card and a specimen sheet showing the form in which
the data were to be listed by the alphabetic accounting
machine in Part I of the exercise. A sample card and a portion
of the specimen form sheet are reproduced as Exhibits J and
K respectively.

As soon as the examinee was ready to start the test he was
provided with a plugboard, an adequate supply of the
various sizes of wires needed in making the connections, and, for
reference purposes, a type-bar layout and a plugboard
diagram. The examinee was then reminded that the time limit
for the entire test was one hour and forty-five minutes.

Elapsed time was recorded by means of an electric job
clock by stamping the starting time and finishing time of
each operation on the examinee's job card. Since it was easier
to secure the use of plugboards than tabulating and
accounting machines, the equipment at each center included about
three times as many plugboards as it did pieces of mechanical
equipment. As a result of this it was sometimes necessary
for an examinee to wait a few minutes for his turn at an
alphabetic accounting machine. At no time, however, was it found
that he was required to wait longer than 10 minutes, and
this "lost time" was, of course, automatically taken care of
by the job clock timing method employed. The small delays
caused by having several times as many examinees wiring
plugboards as could be accommodated at the machines were so




slight as to cause virtually no inconvenience to the candidates.
On the other hand, the saving in time and expense which
resulted from following this procedure was considerable.
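The way the job clock takes care of "lost time" can be illustrated with a short calculation. In this sketch the stamped times are invented and the `H:MM` stamp format is an assumption; the point is only that summing the stamped intervals of each operation leaves out any waiting between operations.

```python
from datetime import datetime, timedelta


def elapsed_working_time(stamps):
    """Sum the stamped start/finish intervals of each operation.

    stamps: list of (start, finish) strings such as ("9:00", "9:40").
    Gaps between one finish and the next start (waiting for a machine)
    are not counted.
    """
    total = timedelta()
    for start, finish in stamps:
        begun = datetime.strptime(start, "%H:%M")
        ended = datetime.strptime(finish, "%H:%M")
        total += ended - begun
    return total


# Hypothetical job card: an 8-minute wait occurs between 9:40 and 9:48.
job_card = [("9:00", "9:40"), ("9:48", "10:15"), ("10:15", "10:35")]
working_time = elapsed_working_time(job_card)
```

For this invented card the examinee is charged 87 minutes, although 95 minutes passed on the wall clock.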

When the examinee had finished wiring the plugboard to
his satisfaction, one of the examiners inspected it to make
certain that no connections had been made which would be
likely to damage the machine. The examinee was then given
a pack of 35 punched tabulating cards and directed to
continue with Part I of the test. No limit was placed on the
number of sheets of paper the examinee may have found it
necessary to use nor on the number of times he was permitted
to make changes in the plugboard wiring or machine
adjustments. He was told to write his identification number on
each sheet and to write "final copy" on the one he wished to
submit for scoring. When Part I had been completed, the
examinee continued immediately with Part II.

During the test the examiners made no attempt to rate the
candidates on such points as the correctness of their
particular approach, the acceptability of their work habits, nor, as
already mentioned, the number of times they found it
necessary to change the wiring or readjust the machine. The only
factors taken into consideration in scoring the test were (1)
the accuracy with which the assignments had been carried out,
as shown by the finished products, and (2) the length of time
consumed by the examinee in completing both parts of the
exercise.

Reproduced as Exhibit L is a copy of the rating form
which has been marked to show the scores of a typical
examinee. The number in the parentheses after each item on the
rating form is the maximum score obtainable for that item.
An examinee completing both parts of the exercise correctly
within 45 minutes would receive 30 points for Part I, 30
points for Part II, and 40 points for finishing within the
minimum time bracket, making a total score of 100. A
functional breakdown of the 60 points assigned to Parts I and
II shows the following: sorting, 10 points; location of
alphabetic and numeric fields, 30 points; totals and subtotals, 20
points.

The Test for Duplicating Machine Operators

The position of duplicating machine operator occurs in
the State offices of the operating agency and the merit system
agency. Persons filling these positions are required to
operate several models of mimeograph and multilith machines. As
most of the work involves the continuous use of the Class
1200 multilith and the Model 100 mimeograph in the
performance of a large variety of duplicating jobs, and persons
who can satisfactorily operate these models ordinarily have
very little difficulty operating the older and less complicated
types of equipment, the performance test was built around
these particular machines, and the examinees were so informed
well in advance of the date of the examination.

A copy of the Instructions to Examinees (Exhibit M)
was furnished each candidate at least 10 minutes before he
was required to begin the test. He was told to read these
Instructions carefully and to keep them with him for
reference throughout the examination.

The test for mimeograph operators was administered first.
Each examinee was provided with a mimeograph stencil into
which a solid box of typewritten material measuring 2^" by
4)4" had been freshly cut, 75 4" by 6" white file cards, and
75 sheets of letter-size mimeograph paper on which a 3*4"
by Syi" frame had been printed. The examinee was then
referred to his printed instructions which directed him to (1)
place the stencil on the cylinder, (2) adjust the machine and
duplicate 25 cards so that the material which had been typed
on the stencil was centered on each card, (3) readjust the
machine and duplicate 25 sheets so that the typed material
was centered within the preprinted frame, and (4) remove
the stencil and prepare it to be filed for future use. The
frame preprinted on the letter-size paper was located in a
position which required maximum adjustment of the machine's
margin guides before the typewritten material on the stencil
could be made to print within the required borders.




When he was ready for the multilith portion of the test, 
the candidate was provided with a photographic multex plate 
containing typewritten material measuring 6" by 8", 75 sheets 
of letter-size bond paper on which a 6½" by 8½" frame 
had been printed, and a supply of platex, keepeze, blankrola, 
repelex, and absorbent cotton. He was then referred to his 
printed instructions which directed him to (1) apply the 
platex, (2) put the plate on the machine, (3) adjust the machine 
and duplicate 25 sheets so that the typewritten material on 
the plate was centered within the preprinted frame, (4) 
remove the plate and prepare it to be filed for future use, and 
(5) clean the blanket. Two forms of the multilith plate were 
used alternately. These forms differed from each other only 
in the location of the typewritten material on the plate and 
were designed so that, while all candidates were required to 
make exactly the same kind of adjustments to center the 
material properly, the examiners were relieved of the necessity of 
setting up the machine after each run. 

The examiners observed the candidates from a reasonable 
distance throughout the test in order to complete the rating 
form shown as Exhibit N. Timing was accomplished with a 
stop watch, and considerable use was made of the remarks 
column of the rating sheet to record all occurrences likely to 
contribute toward a fair evaluation of the examinee’s 
performance. Whenever necessary, candidates were told how to 
turn on the particular model of equipment on which the test 
was given. Certain other bits of information, such as the use 
of the ink rolls on the multilith and the side margin 
adjustments on both the multilith and the mimeograph, were also 
given, but careful note was made of the circumstances in each 
case so that the stipulated penalties could later be subtracted. 

The half hour time limit on each machine was mentioned 
in the Instructions to Examinees but was not particularly 
emphasized. As the candidate started each part of the test the 
examiner said, "You will be allowed up to 30 minutes to 
complete this part of the test. The time you actually 
consume will enter into the computation of your score, but you 
ought not to work so rapidly that the quality of your work 
suffers." Very few candidates required more than 15 minutes 
to complete the mimeograph test or more than 20 minutes 
for the multilith test, and most of those requiring more than 
this time were examinees who either got off to a bad start or 
were obviously so unfamiliar with the equipment that they 
were simply persevering while hoping for a miracle to occur. 
Persons in the latter group were encouraged to continue as 
long as they did not endanger the equipment. The additional 
time required to do this was cheerfully charged to public 
relations when it was discovered that, instead of rationalizing 
that if they had been given more time they would have 
succeeded, these candidates eventually insisted on withdrawing 
of their own accord and almost invariably thanked the 
examiners for "being so patient with me and giving me 
every break." 

Scoring was accomplished by determining the number of 
points earned by the examinee for correctly accomplishing 
each of the items listed in the schedule of credits (Exhibit 
O), and then entering the appropriate amounts in the spaces 
provided on the summary sheet (Exhibit P). As finally 
worked out, the schedule of credits provided a weight of 60 
for the multilith portion of the test, and 40 for the 
mimeograph. 

In establishing the number of credits to be allowed for the 
successful accomplishment of each "item" of the test, 
consideration was given to the relative difficulty of the particular 
function under consideration. Thus, in scoring the mimeograph 
portion of the test, twice as much credit (8 points) was 
granted when the examinee’s finished product presented 
evidence of correct side-margin adjustments as when the vertical 
margins were satisfactory (4 points). For the multilith, on 
the other hand, more credit (10 points) was given for proper 
adjustment of the vertical margins than for having the side 
margins correct (6 points), as the schedule of credits 
(Exhibit O) shows. 
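
The weights described above can be checked against the schedule of credits. The following is a minimal sketch: the point values are transcribed from Exhibit O, whereas the dictionary layout, the key names, and the two helper functions are hypothetical conveniences introduced only for illustration.

```python
# Maximum credits per item, transcribed from the schedule of credits
# (Exhibit O). Key names are illustrative, not taken from the form.
SCHEDULE = {
    "mimeograph": {
        "time": 16,
        "stencil": 2,
        "cards": {"practice": 2, "copies": 2, "counter": 1,
                  "vertical": 2, "side": 4},
        "paper": {"practice": 2, "copies": 2, "counter": 1,
                  "vertical": 2, "side": 4},
    },
    "multilith": {
        "time": 24,
        "clean_plate": 5,
        "clean_blanket": 2,
        "paper": {"practice": 5, "copies": 3, "counter": 2,
                  "vertical": 10, "side": 6, "inking": 3},
    },
}


def max_credit(node):
    """Sum every point value at or below this node of the schedule."""
    if isinstance(node, dict):
        return sum(max_credit(child) for child in node.values())
    return node


def margin_credit(machine, kind):
    """Total credit for one kind of margin ('side' or 'vertical')."""
    groups = SCHEDULE[machine].values()
    return sum(g[kind] for g in groups if isinstance(g, dict))


for machine in SCHEDULE:
    print(machine, max_credit(SCHEDULE[machine]),
          margin_credit(machine, "side"),
          margin_credit(machine, "vertical"))
```

For the mimeograph, the side-margin and vertical-margin totals each combine the card run and the paper run, which is how the schedule's 4-point and 2-point entries become the 8 and 4 points cited in the text.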




Exhibit G 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test, Series 1900 

Graphotype and Addressograph Machine Operators 

July 1940 

INSTRUCTIONS TO EXAMINEES 

Important: Failure to follow instructions may 
result in disqualification from the examination. 

Study these instructions carefully. When you are ready to begin 
the examination, signal the Examiner. He will assign you to a machine 
and furnish you with the material with which you are to work. 

Graphotype Machine 

The examination for this machine consists of embossing a number 
of names and addresses in accordance with the form shown in the 
attached sample. 

As soon as you have been assigned to a machine, the Examiner will 
furnish you with a file drawer containing 22 plates and 20 frames. You 
will be given 5 minutes to familiarize yourself with the machine, during 
which time you may use 2, and only 2, of the plates to practice 
embossing. 

At the conclusion of the practice period the Examiner will collect 
the 2 practice plates and give you a mimeographed list of names and 
addresses which you are to begin embossing as soon as he gives the signal 
to "Start." 

Continue embossing until the Examiner calls "Time." Do not put 
the plates in the frames as they are embossed; you will be required to 
do that later. 

"Time" will be called at the end of exactly 10 minutes. 

The list of names to be embossed has purposely been made longer 
than even the fastest operators are likely to be able to complete. If the 
Examiner calls "Time" while you are in the midst of embossing a plate, 
you may remove the unfinished plate from the machine, but you must 
not continue to emboss it. 

Inserting Plates in Frames 

As soon as the Examiner tells you to do so, place each embossed 
plate in the lower part of a frame and arrange all frames in the file 
drawer so that they will be ready to run through the Addressograph. 
When you have finished this task, the Examiner will provide you 
with an envelope containing twenty 4" by 6" cards (on which your 
Identification Number has been printed) and assign you to an 
Addressograph. 

Addressograph Machine 

The examination for this machine consists of printing from each 
plate that you have embossed. 

The machine will be set to print consecutively, and you will be 
required to make a single impression on each 4" by 6" card in the 
position shown on the attached sample. 

PLACE YOUR APPOINTMENT SLIP, THE INSPECTION 
SHEET, AND THE TWENTY CARDS INTO THE 
LARGE ENVELOPE AND SEAL IT 




Exhibit H 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test 

Graphotype and Addressograph Machine Operators 

Series 1900 
July 1940 

INSTRUCTIONS TO EXAMINER 

1. Read the INSTRUCTIONS TO EXAMINEES and become 
entirely familiar with their contents before attempting to administer 
the examination. 

2. Examinees will be scheduled at the rate of three or four per hour 
where one set of machines is available, and at the rate of six or 
eight per hour where two sets are available. 

3. Do not admit anyone without an Admittance Slip unless he can 
establish his identity as an examinee who has qualified for the 
machine test. 

4. Provision should be made for the examinee to be comfortably seated 
away from the scene of the examination while he is awaiting his 
turn to operate the machines. 

5. About 10 minutes before the examinee is assigned to a machine, 
take his fingerprint, hand him a copy of the mimeographed 
INSTRUCTIONS, and tell him to read them carefully. If he asks 
any questions, you may answer them, but it should not be necessary 
to furnish any information beyond that already appearing in the 
INSTRUCTIONS. 

6. When the examinee is ready to begin the examination and a 
Graphotype machine becomes available, assign him to the machine 
and furnish him with a file drawer containing 22 plates and 20 
frames (all in perfect condition). 

7. Tell the examinee he may have five minutes to familiarize himself 
with the machine and may practice embossing on two of the plates. 

8. At the expiration of five minutes (or before, if the examinee says 
he is ready to begin) collect the two practice plates and hand him 
the list of names to be embossed. Then say: 

"DO NOT BEGIN UNTIL I GIVE THE SIGNAL. 
EMBOSS EACH NAME AND ADDRESS ON A 
SEPARATE PLATE USING THE SAME FORM AS IN 
THE SAMPLE ATTACHED TO YOUR 
INSTRUCTIONS. COPY THE NAMES AND ADDRESSES 
EXACTLY AS THEY APPEAR, OMITTING ALL 
PUNCTUATION. AT THE END OF 10 MINUTES, 
WHEN I CALL 'TIME,' YOU MUST STOP 
EMBOSSING." 

9. Say to the examinee, "READY, START." 

10. Permit the examinee to continue exactly 10 minutes. Then say, 

"TIME. STOP EMBOSSING." 

11. If a plate is in the machine, permit him to remove it. Take away 
the list of names and all blank plates. Then say: 

"PLACE EACH EMBOSSED AND PARTIALLY 
EMBOSSED PLATE INTO THE LOWER PART OF A 
FRAME AND ARRANGE THE FRAMES IN THE 
FILE DRAWER SO THAT THEY CAN BE RUN 
THROUGH THE ADDRESSOGRAPH." 

12. When the examinee has placed each plate in a frame, take away 
the unused frames, hand him his envelope containing the cards, and 
assign him to an Addressograph. Then say: 

"PRINT THE CARDS FROM THE PLATES. MAKE 
ONLY ONE IMPRESSION ON A CARD AND IN 
APPROXIMATELY THE SAME POSITION AS 
SHOWN ON THE SAMPLE ATTACHED TO YOUR 
INSTRUCTIONS. THE ADDRESSOGRAPH HAS 
BEEN SET TO PRINT CONSECUTIVELY. GO 
AHEAD." 

13. When the examinee has made an impression from each plate, tell 
him to place the 20 cards (printed and unprinted) into the envelope 
together with his Admittance Slip and Instructions and seal the 
envelope. 

Note: If the examinee is unable to operate the Addressograph sufficiently well 
to print a legible copy from each plate he has embossed, have the plates printed 
on a strip of paper so that a record of his performance on the Graphotype will 
be available for scoring. 

14. If at any stage of the machine operation you and the 
Addressograph representative are convinced that the examinee does not 
possess sufficient knowledge of the operation of either machine to 
continue with safety to himself and without damage to the 
equipment, the test may be halted. If this becomes necessary, a full 
statement of the circumstances must be written on the back of the 
Admittance Slip and signed by both you and the representative. 
In any instance in which the plates themselves will be valuable as 
possible exhibits, they should be enclosed in the examinee's envelope 
before sealing. 




Exhibit I 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test, Series 2100 
Senior Tabulating Machine Operator 
December 1940 

INSTRUCTIONS TO EXAMINEES 

Important: Failure to follow instructions may 
result in disqualification from the examination. 

Study these instructions carefully. As soon as a machine becomes 
available, the Examiner will give you further instructions and furnish 
you with all necessary material. 

The performance test for this position will consist of a two-part 
exercise designed to determine your ability to operate the IBM 
Horizontal Counting Sorter and IBM Alphabetic Accounting Machine. The 
time limit for the entire test is 1 hour and 45 minutes. 

You will be given 35 tabulating cards (into which various data 
have been punched) and a plugboard for the Alphabetic Accounting 
Machine. Note that the model of the Accounting Machine being used 
has 32 counters and 55 type bars of which numbers 19 to 43 are 
alphabetic. 

The following operations should be carried out in the order 
indicated: 

PART I 

1. Wire the plugboard so that the machine will list the following 
information exactly as shown on the attached sample: 

Card Number 
Name (last, first initial, middle initial) 
Social Security Number 
Total Benefits Paid 
Weekly Benefit Amount 
Reason 

Note: In addition to listing the data, the machine is to be 
wired to show totals at the end of each of the following fields: 

Total Benefits Paid (allow for six digits) 
Weekly Benefit Amount (allow for six digits) 

2. Sort the cards in "Card Number" order. 

3. Write your Identification Number in the space provided on 
Form I and list the data from the cards to conform to the 
sample and the above instructions. 



PART II 

1. Wire the plugboard to control on "Reason." 

2. Sort the cards by "Reason," disregarding "Card Number." 

3. Set automatic hammerlock control to eliminate the listing of 
names. 

4. On another copy of Form I write your Identification Number 
in the space provided and list the data from the cards, single 
spaced, to show subtotals for each reason (for Total Benefits 
Paid and Weekly Benefit Amount). 

PLACE YOUR ADMITTANCE SLIP, THIS INSTRUCTION 
SHEET, THE PUNCHED CARDS, AND BOTH COPIES 
OF FORM I IN THE MANILA ENVELOPE AND SEAL IT 




Exhibit L 

Performance Test, Tabulating Machine Operators 
Series 2100 

Ident. Number: D 24356 
Legal County: Allegheny 
File Number: 

Withdrew from Examination (0) 

PART I 

Sorting by Card Number (5) 

Location of Fields (15) 
    Card Number (1) 
    Name (2) 
    Social Security Number (2) 
    Total Benefits Paid (2) 
    Weekly Benefit Amount (2) 
    Reason (2) 
    Totals 
        Benefits Paid Column (2) 
        Benefit Amount Column (2) 

Accuracy of Totals (10) 
    Benefits Paid Column (5) 
    Benefit Amount Column (5) 

PART II 

Sorting by Reason (5) 

Location of Fields (15) 
    Card Number (1) 
    Name eliminated (2) 
    Social Security Number (2) 
    Total Benefits Paid (2) 
    Weekly Benefit Amount (2) 
    Reason (2) 
    Subtotals 
        Benefits Paid Column (2) 
        Benefit Amount Column (2) 

Accuracy of Subtotals (10) 
    Benefits Paid Column (5) 
    Benefit Amount Column (5) 

Time (40) 
    45 minutes or less (40) ____ 
    46 to 60 minutes (30)   48 min. 
    61 to 75 minutes (20)   ____ 
    76 to 90 minutes (10)   ____ 
    91 to 105 minutes (0)   ____ 

TOTAL RAW SCORE ____ 

Scored by: SWK 
Checked by: ____ 

(Form EB-742) 



Exhibit M 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test 
Duplicating Machine Operators 
Series 2800 
November 1940 

INSTRUCTIONS TO EXAMINEES 

Important: Failure to follow instructions may 
result in disqualification from the examination. 

Study these instructions carefully. As soon as a machine becomes 
available, the Examiner will give you further instructions and furnish 
you with all necessary material. 

Mimeograph Machine. Time limit, 30 minutes. The Examiner 
will furnish you with the following material: 

1 newly cut mimeograph stencil 
75 4" by 6" cards 
75 sheets of pre-printed 8½" by 11" mimeograph paper 
(sample attached) 

The examination for this machine will consist of (1) putting the 
stencil on the cylinder, (2) adjusting the machine and duplicating 25 
cards so that the material which has been typed on the stencil is 
centered on each card, (3) readjusting the machine and duplicating 25 
sheets so that the typed material is centered within the pre-printed box, 
and (4) removing the stencil and preparing it to be filed for future 
use. (You may use as many cards and sheets of paper as necessary in 
setting up the machine, but do not waste any. All material will be 
considered in determining your score in the examination.) 

WHEN YOU HAVE COMPLETED THIS PORTION 
OF THE TEST, PLACE THE STENCIL AND ALL 
USED CARDS AND PAPER INTO THE LARGE 
MANILA ENVELOPE. 

Multilith Machine. Time limit, 30 minutes. The Examiner will 
furnish you with the following material: 

1 photographic multex plate 
75 sheets of pre-printed 8½" by 11" bond paper (sample 
attached) 
Supply of Platex, Keepeze, Blankrola, Repelex, and absorbent 
cotton 




The examination for this machine will consist of (1) applying 
Platex, (2) putting the plate on the machine, (3) adjusting the machine 
and duplicating 25 sheets so that the typed material is centered within 
the pre-printed box, (4) removing the plate and preparing it to be filed 
for future use, and (5) cleaning the blanket. (You may use as many 
cards and sheets of paper as necessary in setting up the machine, but do 
not waste any. All material will be considered in determining your score 
in the examination.) 

WHEN YOU HAVE COMPLETED THIS PORTION 
OF THE TEST, PLACE YOUR ADMITTANCE SLIP, 
THIS INSTRUCTION SHEET, AND ALL USED 
PAPER INTO THE LARGE MANILA ENVELOPE 
AND SEAL IT. 





Exhibit N 

COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 
OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 

Harrisburg 

Performance Test for Duplicating Machine Operators 

Series 2800 
November 1940 

EXAMINER'S RATING SHEET 

Examinee's Id. No. ____ 

Mimeograph Machine 

OPERATION                          Card    Paper   Remarks 
Place stencil on cylinder                  X 
Adjust paper feed 
Make side margin adjustment 
Make vertical margin adjustment 
Use of print recorder 
Run copies 
Take off stencil                   X 
Clean stencil                      X 

TIME:  START ____   STOP ____   ELAPSED ____ 

Multilith Machine 

OPERATION                          Paper   Remarks 
Platex plate 
Put on plate 
(entry illegible) 
Pull proof 
Locate form in proper position 
Clean blanket 
Set counter 
Run copies 
Clean plate 
Take off plate 
Keepeze 
Clean blanket 

TIME:  START ____   STOP ____   ELAPSED ____ 

Date ____ 

Examiner ____ 

Examiner ____ 



Exhibit O 

PROCEDURE FOR SCORING DUPLICATING MACHINE 
OPERATOR PERFORMANCE TEST 

SERIES 2800 

SCHEDULE OF CREDITS 

Note: The examinee's rating for each item is to be 
determined in accordance with the following schedule 
and entered in the appropriate space on the scoring 
sheet (Form EB-760). All ratings and totals must 
be checked by a second scorer. 

Mimeograph (40) 

Time (16) 
    1 to 10 minutes: 16 points 
    11 to 15 minutes: 12 points 
    16 to 20 minutes: 8 points 
    21 to 25 minutes: 4 points 
    26 to 30 minutes: no credit 

Stencil (2) 
    Removed and cleaned properly (credit if checked on rating sheet): 
    2 points 

4" by 6" cards (11) 
    Practice cards (not more than 15): 2 points 
    Number of copies (25 to 30): 2 points 
    Use of counter: 1 point 
    Vertical margin adjustment (80% of final copies with at least 
    [fraction illegible] margin top and bottom): 2 points 
    Side margins (80% of final copies with at least ½" margin each 
    side): 4 points 

8½" by 11" paper (11) 
    Practice sheets (not more than 15): 2 points 
    Number of sheets (25 to 30): 2 points 
    Use of counter: 1 point 
    Vertical margins (80% of final copies not touching horizontal 
    lines): 2 points 
    Side margins (80% of final copies not touching vertical lines): 4 
    points 

Penalty (-4) 
    No. 1. If examinee was given information on side margin 
    adjustment, subtract 4 points (but only when examinee has received 
    credit for correct side margin adjustment). 

Multilith (60) 

Time (24) 
    1 to 10 minutes: 24 points 
    11 to 15 minutes: 18 points 
    16 to 20 minutes: 12 points 
    21 to 25 minutes: 6 points 
    26 to 30 minutes: no credit 

Clean plate (credit if checked on rating sheet): 5 points 
Clean blanket (credit only if second listing is checked on rating sheet): 
2 points 

8½" by 11" paper (29) 
    Practice sheets (not more than 15): 5 points 
    Number of copies (25 to 28): 3 points 
    Use of counter: 2 points 
    Side margins (at least 3/16" on each side): 6 points 
    Vertical margins (at least 3/16" top and bottom): 10 points 
    Inking (3) 
        Evenness: 2 points 
        Blackness: 1 point 

Penalties (-10) 
    No. 2. If examinee was given information on use of ink rolls, 
    subtract 5 points. 
    No. 3. If examinee was given information on side margin 
    adjustment, subtract 3 points (but only when examinee has 
    received credit for correct side margin adjustment). 
    No. 4. If examinee left an ink roll in contact with plate cylinder, 
    subtract 2 points. 
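
The time credits and the conditional penalties in this schedule can be expressed compactly. What follows is a sketch only: the point values come from the schedule above, but the function names, argument names, and the assumption that elapsed time is recorded in whole minutes are hypothetical.

```python
# Time credits from the schedule of credits, as
# (upper limit in minutes, credit) pairs. Whole minutes are assumed.
TIME_CREDITS = {
    "mimeograph": [(10, 16), (15, 12), (20, 8), (25, 4), (30, 0)],
    "multilith": [(10, 24), (15, 18), (20, 12), (25, 6), (30, 0)],
}


def time_credit(machine, minutes):
    """Look up the time credit for one machine."""
    if minutes < 1:
        raise ValueError("elapsed time must be at least 1 minute")
    for upper_limit, credit in TIME_CREDITS[machine]:
        if minutes <= upper_limit:
            return credit
    raise ValueError("elapsed time exceeds the 30-minute limit")


def multilith_penalty(told_ink_rolls, told_side_margins,
                      side_margin_credited, left_ink_roll_on_plate):
    """Total of Penalties No. 2, 3, and 4 for the multilith."""
    penalty = 0
    if told_ink_rolls:
        penalty += 5  # Penalty No. 2
    # Penalty No. 3 applies only if side-margin credit was earned.
    if told_side_margins and side_margin_credited:
        penalty += 3
    if left_ink_roll_on_plate:
        penalty += 2  # Penalty No. 4
    return penalty


print(time_credit("multilith", 12))
print(multilith_penalty(True, True, False, True))
```

The condition on Penalty No. 3 (and on Penalty No. 1 for the mimeograph) means that help with the side margins is never charged against an examinee who earned no side-margin credit in the first place.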



Exhibit P 

PERFORMANCE TEST—DUPLICATING MACHINE 
OPERATORS—SERIES 2800 



Operation of Mimeograph (40) 

Withdrew or was stopped by Examiners (F) ____ 
Time (16) ____ 
Stencil (2) ____ 

4" by 6" Cards (11) 
    Practice cards (2) ____ 
    Number of copies (3) ____ 
    Vertical margin adjustment (2) ____ 
    Side margins (4) ____ 

8½" by 11" Paper (11) 
    Practice sheets (2) ____ 
    Number of sheets (3) ____ 
    Vertical margin adjustment (2) ____ 
    Side margins (4) ____ 

Operation of Multilith (60) 

Withdrew or was stopped by Examiners (F) ____ 
Time (24) ____ 
Clean Plate (5) ____ 
Clean Blanket (2) ____ 

8½" by 11" Paper (29) 
    Practice sheets (5) ____ 
    Number of copies (5) ____ 
    Vertical margin adjustment (10) ____ 
    Side margins (6) ____ 
    Inking (3) ____ 

Penalties: No. 1 ____  No. 2 ____  No. 3 ____  No. 4 ____   Subtract ____ 

Total Raw Score ____ 

Scored by ____ 
Checked by ____ 

(Form EB-760) 


























THE VALUE OF INTELLIGENCE QUOTIENTS 
OBTAINED IN SECONDARY SCHOOL 
FOR PREDICTING COLLEGE SCHOLARSHIP 

L. D. HARTSON 
and 

A. J. SPROW 
Oberlin College 

IN SELECTIVE admission to college, and particularly in 
the award of scholarships, it is the practice to request a 
report of scores made by the candidates in intelligence tests. 
This study reports (1) the relative value of these different 
tests for predicting (a) total high school scholarship,¹ (b) 
college freshman scholarship, and (c) seven-semester college 
scholarship; (2) the comparative validity of these tests and 
the Ohio State University Psychological Examination; (3) the 
average I.Q. of the student body, as determined by these 
various instruments; (4) comparison of the I.Q.’s of the 
Oberlin group with those of the Terman-Merrill 
standardization group; (5) average freshman scholarship for students of 
the different I.Q. levels. 

A total of 835 freshmen entered the College of Arts and 
Sciences, Oberlin College, during the period, 1934 to 1940, 
for whom I.Q.’s were available, which could be identified with 
specific forms of tests, in groups large enough to warrant 
statistical treatment. In six cases there were two scores, making 
the total of 841 in Table 1. Of these, 253 had progressed as 
far as their eighth semester. For these, the computations are 
based upon the scholarship record for seven semesters (those 
used for the award of Phi Beta Kappa). In order to make 
the scholastic records equivalent for the several classes of 
varying size, scholarship is handled in terms of proportional 
class rank. Because the reports did not, in all cases, specify 
the particular variety of Otis test employed, all the Otis scores 
have been grouped. All the I.Q.’s here considered were 
derived from group tests. 

¹The figures used in the computations of high school scholarship represent, 
not the actual grades, but "credit points" obtained by a system used to equate 
different grading schemes. 

TABLE 1 

(Table body illegible in the source; its contents are described 
under Results.) 

Results 

Table 1 reports the coefficients of correlation (1) between 
the I.Q.’s on the Otis, Terman, Henmon-Nelson, National, 
and Kuhlmann-Anderson Tests, and scores on the Ohio State 
University Psychological Examination administered before 
matriculation in college as one variable, and high school 
scholarship; (2) between the above-named tests and first 
semester college scholarship; (3) between I.Q.’s on one or 
another of the first five tests and scores on the OSU test, 
administered during freshman week, with the means and 
sigmas. In the case of the National and of the 
Kuhlmann-Anderson tests the N is rather small, and all the data, 
therefore, are less reliable than those obtained with the other tests. 
To obtain a basis for comparing the validities of the OSU test 
and each of the others, Table 2 reports, for each of the test 
groups, the correlation between scores in the OSU test and 
(a) high school scholarship and (b) first semester college 
scholarship, with means and sigmas. 

TABLE 2 

CORRELATIONS BETWEEN THE OSU TEST SCORES AND SCHOLARSHIP OF 
THE GROUPS TESTED WITH THE OTIS, TERMAN, HENMON-NELSON, 
NATIONAL, KUHLMANN-ANDERSON AND OSU TEST (PRE-ENTRANCE 
GROUP) WITH MEANS AND SIGMAS 

                                   Scholarship 
Test Group                  N    High Sch.  Freshman   Mean    Sigma 

Otis                       444     .394       .579     49.35   28.95 
Terman                     221     .337       .550     51.47   28.00 
Henmon-Nelson              110     .458       .604     48.59   29.22 
National                    38     .473       .631     54.18   25.36 
Kuhlmann-Anderson           28     .633       .564     51.93   26.63 
OSU Test (pre-entrance)    258     .510       .629     48.72   28.72 

1. Relative Validities of the I.Q. Tests. A comparison 
of the validities of the different tests yielding I.Q.’s indicates 
that the Henmon-Nelson test gets first place. However, the 
relative deviate of the difference (k) between the correlation 
of Henmon-Nelson I.Q.’s and college scholarship (.480) and 
the corresponding coefficient for the Otis test (.364) is 
but 1.20. 
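
The article does not say how the relative deviate k was computed. As an illustration only, the sketch below applies one common present-day procedure for comparing two correlations from independent groups, the Fisher z transformation; it is not offered as the authors' method and need not reproduce their figure of 1.20. The group sizes (110 for Henmon-Nelson, 444 for Otis) are taken from Table 2 on the assumption that the same cases enter the I.Q. correlations, and the function name is hypothetical.

```python
import math


def fisher_z_ratio(r1, n1, r2, n2):
    """Difference between two independent correlations, in standard
    error units, using the Fisher z transformation (an assumed method,
    not one stated in the article)."""
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


# Henmon-Nelson (.480, N assumed 110) against Otis (.364, N assumed 444).
ratio = fisher_z_ratio(0.480, 110, 0.364, 444)
print(round(ratio, 2))
```

Under these assumptions the ratio stays well below 2, which agrees with the authors' reading that the apparent advantage of the Henmon-Nelson test is not statistically dependable.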

2. The Prediction of High School and College 
Scholarship. The coefficients indicating the relationship between I.Q. 
and high school scholarship range between .212 and .396. 
With the exception of the Kuhlmann-Anderson test (for 
which there are but 28 cases) the I.Q.’s constitute a better 
basis for predicting college scholarship than they do high 
school grades. When validated against college scholarship, 
the coefficients range between .287 (omitting 
Kuhlmann-Anderson) and .480. Byrns and Henmon, who used I.Q.’s 
obtained from National Intelligence Tests, administered in the 
4th to 8th grades, also found a closer correlation between 
I.Q. and first semester college scholarship (.454) than 
between I.Q. and total high school scholarship (.426) (2). 
Higher validities for the college criterion were also obtained 
for both of the OSU examinations. With the pre-entrance 
OSU Test the coefficients are .365 and .474, and with the 
Freshman Week test, they are .510 and .629, for high school 
grades and college scholarship, respectively. These results 
substantiate previous findings at Oberlin. For the 511 men 
and 609 women who entered as freshmen during the period, 
1931 to 1934, the correlation between college scholarship and 
OSU Test intelligence is represented by coefficients of .605 and 
.574, for the men and the women, respectively, whereas the 
correlation between test intelligence and high school 
scholarship is represented by coefficients of .398 and .380 (3). It 
will be noted that the OSU Test scores have higher validity 
than does the I.Q., as indicated by the coefficients obtained 
when correlations were computed for each of the I.Q. 
populations between the scores made with the OSU Test and the two 
scholarship criteria (Table 3). 

TABLE 3 

COMPARATIVE VALIDITIES OF THE OSU TEST AND THE I.Q. TESTS FOR 
PREDICTING HIGH SCHOOL AND COLLEGE SCHOLARSHIP 

                       High School Scholarship   College Scholarship 
Test Group               OSU Test     I.Q.        OSU Test     I.Q. 

Otis                       .394       .322          .579       .364 
Terman                     .337       .281          .550       .403 
Henmon-Nelson              .458       .396          .604       .480 
National                   .473       .212          .631       .287 
Kuhlmann-Anderson          .633       .247          .564       .178 

The OSU Test is designedly a more difficult one than the 
other tests. Although some tests have as many items as the 
OSU Test, none requires as much time. All of the I.Q. tests 
are time limited, maximum time being 30 minutes, whereas 
the OSU Test was administered by work-limit method, stu¬ 
dents usually taking at least two hours. 

3. Comparative Validity of Pre-entrance and Freshman 
Week OSU Test Scores. That the higher coefficients obtained 
for the OSU Test may not be due entirely to its greater 
difficulty, however, is suggested by a comparison of the 
coefficients obtained for the OSU Test under two sets of 
conditions. There were 258 students who took the OSU Test some 
time before entering college who were re-examined with this 
test during their Freshman Week. (In some instances the 
same form of the test was used, but usually it was another 
form.) The Freshman Week test yielded substantially higher 
validity figures with both criteria: .510 as compared with .365 
for high school scholarship, and .629 as compared with .474 
for college freshman scholarship. These coefficients, with the 
means and sigmas, are reported in Tables 1 and 2. It will 
be noted that the group given the OSU (pre-entrance) Test 
had distinctly lower scholastic records than the others and 
that they also displayed greater variability. This is to be 
explained by the fact that the great bulk of these students 
were given the test because their high school record made their 
admission questionable. On the other hand, the group 
contained some who, being of exceptionally high caliber, were 
applying for scholarships. In most cases, it may be presumed, 
the OSU Test was administered under conditions of greater 
motivation than prevailed with the I.Q. tests. 

4. Intercorrelations Between the Test Scores. The 
intercorrelations between the scores made on the OSU Test and the 
different forms of I.Q. range from .456 to .610; the sequential 
order from the higher to lower coefficients being: Terman, 
Henmon-Nelson, Otis, National, and Kuhlmann-Anderson. 

5. Validation of Tests Against Total College Scholarship. 
There are 253 students for whom scholastic grades are avail¬ 
able for seven semesters of the college course. Because the 
numbers were too small to warrant separate computations for 
each of the tests, all of the I.Q.’s were combined and the 
validity coefficients computed, using both the one- and the 
seven-semester criterion. Validity coefficients were obtained 
for the OSU Test for the same population. These are given 
in Table 4. The two validity figures for the I.Q.’s are .341 
and .319, and for the OSU Test scores the figures are .501 
and .438. Although scores on the OSU Test are more valid 
bases for prognosing college grades than is the I.Q. at both 
levels, their superiority for predicting total college scholarship 
is less than when used for predicting freshman grades. From 
Table 4, one may also note that, as in the computations reported
for the other populations, the I.Q. (and OSU Test
score) shows a closer relationship for college freshman
scholarship than for high school scholarship.

TABLE 4

CORRELATIONS BETWEEN I.Q.’s AND OSU TEST SCORES AND (1) HIGH
SCHOOL SCHOLARSHIP, AND SCHOLARSHIP FOR (2) ONE AND (3) SEVEN
SEMESTERS; WITH MEANS AND SIGMAS

Test Score      N     High Sch.   1 Sem.   7 Sems.    Mean     Sigma

I.Q.           253    [.471?]      .341     .319     121.24    10.00
OSU Test       253     .307        .501     .438      50.95    29.85

Mean                  75.82       53.37    47.36
Sigma                 17.19       25.68    29.00

6. Average I.Q. of the Oberlin Student Body. The mean
I.Q. of the 835 freshmen, as measured by these group tests,
is 121.06. There is substantial agreement on this point between
the Otis, Terman, and Henmon-Nelson tests (see
Table 1). The cases measured by the National and Kuhlmann- 
Anderson tests, for which the means are 127.21 and 124.96 
respectively, are too few to influence the general average 
materially. The average for the group of 253 who attained 
senior status is 121.24, thus indicating that practically no 
selection occurred between the freshman and the senior year 
in terms of I.Q. This is corroborated by the OSU Test stand¬ 
ing of the freshman and senior groups. In terms of local 
freshman norms, the mean score of the 835 students is 49.97, 
thus indicating that they are an almost completely perfect 
sample of the Oberlin first-year population. The mean score 
of the 253 who became seniors is 50.95. The seniors do, how¬ 
ever, constitute a somewhat selected group in terms of college 
scholarship. This is indicated by the fact that, whereas the 
mean freshman scholarship of the entire group is represented 
by a proportional rank of 49.55, the mean freshman rating of 
those who persisted until they reached senior status is 53.37— 
the larger figure represents higher scholarship status—and 
the mean scholarship of the 588 who had not become seniors 
is 47.91. The critical ratio of the difference between the fresh¬ 
man scholarship of those who became seniors and those who 
did not is 2.73. 

7. Comparison of Oberlin Students with the Terman- 
Merrill Standardization Group. Figure 1 presents a graphic 
comparison of the Oberlin students with the normal group of 
2904 used in the standardization of the Terman-Merrill Binet 
test (4, p. 37). The numbers and proportions of the Oberlin 
population at the different I.Q. levels are reported in Table 5. 
The I.Q.’s of the Oberlin group range from 92 to 169, 99 per
cent exceeding the Terman-Merrill mean. The Oberlin sample
is rather sharply peaked, showing little kurtosis, but it displays
a slight positive skewness. Variability is much less than that
of the Terman-Merrill sample, sigma being 10.4 as compared
with 16.4 for the larger group.

TABLE 5

FRESHMAN AND SENIOR SCHOLARSHIP OF STUDENTS OF
DIFFERENT I.Q. LEVELS

                     Entrants                              Seniors
           ----------------------------   ----------------------------------------
                         Scholarship      N pos-   N ac-            Scholarship
I.Q.         N      %    Mean    Range    sible    tual      %      Mean    Range

166-170      3     0.4   97.7    97-99       3       2      66.7    96.0    93-99
161-165      0                               0
156-160      0                               0
151-155      1     0.1   94.0    94          1       1     100.0    87.0    87
146-150     13     1.6   72.9    13-95       5       3      60.0    63.3    30-90
141-145     14     1.7   72.1    26-97       7       5      71.4    66.0    12-84
136-140     44     5.3   70.9     5-98      16      13      81.3    51.2     1-99
131-135     57     6.8   59.8     4-99      18      14      77.8    49.6    13-99
126-130    110    13.2   58.4     1-99      43      36      83.7    57.4     1-99
121-125    178    21.3   52.1     1-99      74      57      77.0    53.3     5-100
116-120    179    21.4   45.5     1-98      71      52      73.2    42.4     2-91
111-115    119    14.2   41.3     1-97      48      32      66.7    38.5     2-95
106-110     75     9.0   33.3     1-94      39      26      66.7    31.1     5-65
101-105     29     3.5   36.7     2-98      18       8      44.4    37.3     4-69
 96-100      8     0.9   25.1     3-42       2       2     100.0    27.0    26-28
 91-95       5     0.6   25.8     7-69       2       2     100.0    24.5    14-35

Total      835                             347     253      72.9

Figure 1

Distributions of the I.Q.’s in the Terman-Merrill Standardization Group
and the Oberlin Group
(Horizontal axis: I.Q. in class intervals from 35-44 to 165-174.)

8. Mean Scholarship of Students of Different I.Q. Levels. 
Table 5 reports the numbers and proportions of students of 
the different levels of I.Q. with their (1) mean freshman 
scholarship rank, (2) mean senior scholarship rank, (3) the 
range of scholarship achievement for those at each level, and 
(4) the proportion of those in college long enough to have at¬ 
tained senior status who did so, for each I.Q. level. Exami¬ 
nation of the table reveals the following salient facts: 

(a) As indicated by the correlation coefficients previously 
noted, the general tendency is for those of higher I.Q. to 
make the better scholastic records. 

However, (b) the range of scholastic performance is, 
with few exceptions, remarkably similar at each test level. 
Freshman achievement in the highest and the lowest deciles 
is recorded for students with I.Q.’s ranging all the way from 
105 to 140, although no student with an I.Q. below 111 
achieved a top tenth ranking for the entire college course. 
There was one student with an I.Q. of 105 who achieved a 
proportional rank of 98 in freshman scholarship and has been 
in the upper tenth of her class in each of the two subsequent 
years. Her centile score, according to state norms, on the
OSU Test is, however, 71, so the later test is evidently a more 
accurate index of her intellectual ability. 

(c) The four students with I.Q.’s above 150 all made 
exceptionally good records. 

(d) Sufficient time has elapsed to permit but four stu¬ 
dents whose I.Q.’s are below 101 to become seniors. They 
have all obtained the A.B. degree, but in only one instance was 
this achieved in the normal four-year period. By persistent 
effort, however, they did finish the course, and all of them 
ranked above the lowest decile of their class. This is com¬ 
parable with Adams’ finding at the University of Texas. 



(e) The retentive power of the college was not materially
greater for those at the higher than for those at the
lower extremes of the distribution.

(f) Some degree of selectivity is indicated, however, by
a comparison of those with I.Q.’s above 126 with those whose
I.Q.’s are below 116. Of the 93 in the group with the higher
I.Q.’s, 74, or 79.57 per cent, persisted, whereas but 70, or
64.22 per cent, of the 109 with the lower I.Q.’s persisted to
the senior year. As the ratio of the difference in these proportions
to the standard error of the difference is 2.47, it is
fairly significant.
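
The critical ratio in (f) can be checked from the counts given in the text. The sketch below assumes that the standard error of the difference was computed from the two separate (unpooled) sample proportions; the article does not state which formula was used, and the function name is illustrative only.

```python
import math

def critical_ratio(k1, n1, k2, n2):
    """Difference between two independent proportions divided by its
    standard error (unpooled form; an assumption, not stated in the article)."""
    p1, p2 = k1 / n1, k2 / n2
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se_diff

# Counts from the text: 74 of 93 with the higher I.Q.'s persisted,
# 70 of 109 with the lower I.Q.'s persisted.
cr = critical_ratio(74, 93, 70, 109)
print(round(cr, 2))  # 2.47
```

Under that assumption the computed ratio agrees with the reported value of 2.47.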

Summary 

1. I.Q.’s were available for 835 entering freshmen and
for 253 of these who had reached the senior year, the scores 
having been derived from the following group tests: Otis, 
Terman, Henmon-Nelson, National and Kuhlmann-Ander- 
son. Scores on the OSU Psychological Examination were also 
obtained. 

2. The difference in the power of the different I.Q. tests
to predict college scholarship was not statistically significant. 

3. The I.Q.’s constitute a better basis for predicting college
grades than they do for prognosing total high school
scholarship. This is also true of the OSU Test scores.

4. The OSU Test is more successful than any one of the 
other tests in predicting scholarship in high school as well as 
in college. 

5. The OSU Test taken during Freshman Week corre¬ 
lates more closely with both secondary and college scholar¬ 
ship than does the same test taken during the senior year 
in high school. 

6. The OSU Test predicts freshman scholarship better
than it does total college scholarship.

7. The average I.Q. of the freshmen is 121. The aver¬ 
age obtained by the Otis, Terman, and Henmon-Nelson tests 
is virtually the same. The averages for the small number 
tested with the National and Kuhlmann-Anderson tests are 
127 and 125, respectively. 




8. The average I.Q. for the seniors is also 121. As the
mean OSU Test score for the seniors is but one percentile
point higher than that for the freshmen, it is evident that
virtually no selection occurs during the college course, so far
as test intelligence is concerned. To be sure, there is some
selection in terms of scholastic record, there being a superiority
of 5.46 points in the freshman scholastic rating of
those who persisted over those who did not become seniors.
The critical ratio of this difference is 2.73.

9. The I.Q.’s of the Oberlin freshmen range from 92 to
169. Ninety-nine per cent of the I.Q.’s are over 100. Variability
is represented by a sigma of 10.4, as compared with 16.4
for the Terman-Merrill standardization group.

10. Although the correlation between I.Q. and college 
scholarship is .40, the range of scholastic performance is re¬ 
markably similar at the different test levels between 101 
and 140. 

11. Four students with I.Q.’s between 91 and 100 became 
seniors, but their records were not brilliant. 

12. Although the retentivity of the college was not mate¬ 
rially greater for those at the extremely high end of the 
distribution than for those at the lower end, 80 per cent of 
those with I.Q.’s above 126, as compared with 64 per cent of 
those with I.Q.’s below 116, who had been in college long 
enough, became seniors. 

Conclusions 

Two facts of general significance emerge from the computations:
First, the figures indicate that, although it is to be
expected that students with higher intelligence test scores will
make the better college records, it is nevertheless possible for
the average of the group with I.Q.’s as low as 101-105 to do
acceptable work at Oberlin. There are indeed exceptional
students who, in spite of the handicap of an intelligence quotient
as low as 92, obtain the A.B. degree. Second, test scores
show a consistently closer correlation with college scholarship
than with high school records. Interpretation of these facts
would seem to point to the significance of adequate motivation.
Possessed of determination, drive, and directionality,
the student whose intellectual ability barely equals the average
of the general population can “make the grade,” if equipped
with a good secondary school preparation. Selection of the
student body at Oberlin is made primarily on the basis of the
high school record. Students with low I.Q.’s, who rank in
the lower half of their high school class, are not admitted.
The higher validity figures obtained when college scholarship
is used as the criterion also emphasize the factor of motivation.
Oberlin students, at any rate, apparently work more
nearly up to their potential capacity, so far as this is measured
by the intelligence tests, while in college than in secondary
school.

REFERENCES 

1. Adams, F. J. “College Degrees and Elementary-School Intelligence
Quotients”, Journal of Educational Psychology, XXXI (1940),
360-368.

2. Byrns, R. and Henmon, V. A. C. “Long Range Prediction of College
Achievement”, School & Society, XLI (1935), 872-880.

3. Hartson, L. D. “Further Validation of the Rating Scales Used with
Candidates for Admission to Oberlin College”, School & Society,
XLVI (1937), 155-160.

4. Terman, L. M. and Merrill, M. A. Measuring Intelligence. Boston:
Houghton Mifflin Company, 1937.





THE THURSTONE PRIMARY MENTAL ABILITIES 
TESTS AND COLLEGE MARKS 

MARY LOU ELLISON 
and 

HAROLD A. EDGERTON 
Ohio State University

THE PRESENT STUDY of Thurstone’s Primary Mental
Abilities Tests has been made in order to implement
the assumption that the scores of the several factors might 
be useful in academic counseling. Four questions form the 
basis for the investigation. 

1. What relationships are there between the factor scores 
and academic grades? 

2. What relationships are there between the Ohio State 
University Psychological Test score and the factor scores? 

3. How well can academic grades be predicted on the 
basis of the primary factor scores? 

4. Are the factor scores related to grades in specific col¬ 
lege subjects? 

Thurstone’s development of his Primary Mental Abilities
Tests was for the purpose of appraising seven primary factors
of mind.¹ His isolation of these factors and the development
of the final test battery is described in the monograph
“Primary Mental Abilities.”² Thurstone briefly describes
the factors on his individual record sheet for the tests as
follows:

¹ L. L. Thurstone, Manual of Instructions for Administering Tests for Primary
Mental Abilities, p. 2.

² L. L. Thurstone, “Primary Mental Abilities,” Psychometric Monographs.
Chicago: The University of Chicago Press, 1 (1938).

“Factor P. The tests that call for this ability require the quick
perception of detail in either visual or verbal material. This seems
to be a perceptual ability which enables some people to excel in
finding detail which is significant to them or detail which they are
seeking. It is probably one of the factors that is involved in what
has been called ‘quick intelligence.’ Scanning a page to find quickly
some small but significant detail and classifying familiar objects
quickly are examples of this factor.

“Factor N. This is one of the clearest factors that has been
isolated. It consists of facility with simple numerical work and is
best represented in the tests of rapid calculation. It is of secondary
importance in arithmetical reasoning and in deciphering numerical
code, tasks which call for factors in addition to facility with numbers
as such. It is not yet known whether this factor can be exemplified
in non-numerical tasks.

“Factor V. This is a verbal factor which is manifested in tests
that involve the interpretation of language. It is not restricted to
mere fluency with words. It reflects an ability to deal readily and
quickly with verbal material. Those who excel in this factor are
probably verbally-minded in their thinking and problem-solving.

“Factor S. This is an ability that is present in those tests which
require the subject to think visually of geometrical forms and of
objects in space. While none of these factors can be described in
detail yet, it seems reasonable to expect that those who have a high
rating on ability S should be able to do well in those studies and in
those occupations that require visualizing or thinking about things
in visual form. Many people think about a problem visually even
when the nature of the problem does not immediately suggest any
necessary visual character.

“Factor M. The nature of this factor was identified by the
fact that all of the tests which require it are tests of memorizing.
The appearance of such a factor seems to give justification for the
belief that a good memory is an ability independent of other mental
powers. It is not yet known, however, whether the ability to
memorize is the same as the ability to recall experiences which we
do not intend to retain for future recall. The present factor M can
be tentatively named the ability to memorize.

“Factor I. The tests which require this factor demand that
the subject discover some rule or principle in the material of the
test. The factor does not seem to be restricted to material which is
primarily numerical, primarily visual, or primarily verbal, types
which were all represented in the tests for this factor. The ability
to discover a rule or principle in the solution of a problem is usually
called induction. People differ markedly in the kind of resourcefulness
that is involved in inductive thinking, and the hypothesis that
the factor I is associated with this kind of ability seems plausible.
It is not known whether this factor is associated with inventiveness
and initiative.

“Factor D. The deductive factor is still only tentatively identified.
It is a factor which is present in syllogistic reasoning and
also in some other tests. It is one of several factors that may be
involved in restrictive thinking. In a general description, the factor
seems to represent facility in formal reasoning.”

In the present study, Thurstone’s Primary Mental Abilities
Tests, Experimental Edition, were used.

The subjects consisted of a group of 49 students in the 
College of Arts and Sciences, Ohio State University. Most 
of those who took the test were students in the Exploratory 
Program of the College of Arts and Sciences. 

The students tested do not constitute a random sample 
of students of the Exploratory Program, nor of the College 
of Arts and Sciences, nor of freshmen generally. This fact 
must be taken into consideration in the interpretation of the 
results of the study. No one was required to take the test. 
Of the forty-nine subjects, forty-one were freshmen, six were
sophomores, and two were juniors. In the group, 39 per cent
ranked in the 90th percentile or above in intelligence (Ohio
State University Psychological Test), and 54 per cent were
included in the 80th percentile or above. The mean Point
Hour Ratio³ was 2.40.

In addition to the scores for the seven factors, and the
separate scores on the sixteen individual tests from which
the factor scores are derived, other data from the college records
were used. Intelligence test percentiles were based on
scores received in the Ohio State University Psychological
Examination, given to all students at the time of entrance
to the University. The Point Hour Ratio for each student
was obtained. Grades received in English, sciences, foreign
languages, and psychology were recorded, since it was thought
that each of these groups might be related differentially to
the factor scores. In the group of forty-nine students, English
grades were available for twenty-seven, science grades for
thirty, foreign language grades for twenty-seven, and psychology
grades for twenty-five.

³ The Point Hour Ratio is the total points divided by the hours
attempted. Each hour of grade A counts four points; each hour
of B, three points; C, two points; D, one point; and E (failure),
no points.
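
The Point Hour Ratio described in the footnote can be illustrated with a short computation. This is a sketch only: the course record is hypothetical, and the value of zero points for grade E is an assumption, since that part of the footnote is only partly legible.

```python
# Points per credit hour for each grade, as given in the footnote.
# E (failure) = 0 is an assumption; the source line is partly illegible.
POINTS_PER_HOUR = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}

def point_hour_ratio(courses):
    """courses: list of (grade, credit_hours) pairs.
    Returns total points divided by total hours attempted."""
    hours_attempted = sum(hours for _, hours in courses)
    if hours_attempted == 0:
        raise ValueError("no hours attempted")
    total_points = sum(POINTS_PER_HOUR[grade] * hours for grade, hours in courses)
    return total_points / hours_attempted

# Hypothetical record, not taken from the study.
record = [("A", 5), ("B", 3), ("C", 5), ("E", 2)]
phr = point_hour_ratio(record)
print(phr)  # 39 points / 15 hours = 2.6
```

On this scale the group mean of 2.40 reported above lies between a C and a B average.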

1. What relationships are there between the factor scores
and Point Hour Ratio?

In Table 1, the correlations between Point Hour Ratio
and the various factors are shown. The correlation between
Factor V and Point Hour Ratio is the highest (0.44). Factor
M ranks second in its correlation with P. H. R., the correlation
being 0.31. The other five factors have correlations
with P. H. R. ranging from -0.24 to 0.19. One might speculate
on the meaning of the negative correlations, but on the
basis of such a sample it might be unfortunate.

It is likely that in a really random sample of University 
students or of University freshmen such correlations would 
be zero or positive. 

The multiple correlation between P. H. R. and the
weighted scores of the seven factors is 0.640. When the
Ohio State University Intelligence Test score is included as a
variable with the seven factors, the multiple correlation is
0.648. Such a correlation suggests that there may be some
justification for the use of the Primary Mental Abilities Tests
for the prediction of academic success in college.

2. What relationships are there between the Ohio State
University Psychological Examination scores and the factor
scores?

As in the case with Point Hour Ratio, Factor V shows
the highest correlation with intelligence (0.52). This is
perhaps due to the fact that the expression of intelligence is
largely verbal in character in present tests. The Same-Opposite
Test, a component of Factor V, shows the highest
correlation of the several sub-tests with intelligence test
scores. A somewhat similar test is found in the Ohio State
University Psychological Test.

TABLE 1

COMPARISON OF CORRELATIONS OF FACTORS AND INDIVIDUAL TESTS
WITH INTELLIGENCE TEST SCORES AND POINT HOUR RATIO

(N = 49)

                            Intelligence Test         Point Hour Ratio
                          Composite  Individual     Composite  Individual
                           Factor       Test         Factor       Test

Factor P                    0.06                     -0.24
  Identical Forms                      -0.05                     -0.21
  Verbal Enumeration                    0.16                     -0.27
Factor N                   -0.02                      0.17
  Addition                             -0.07                     -0.07
  Multiplication                        0.01                      0.29
Factor V                    0.52                      0.44
  Completion                            0.41                      0.33
  Same-Opposite                         0.55                      0.37
Factor S                   -0.11                     -0.21
  Figures                              -0.12                     -0.40
  Cards                                -0.07                     -0.01
Factor M                    0.28                      0.31
  Initials                              0.34                      0.32
  Word-Number                           0.09                      0.17
Factor I                    0.11                     -0.13
  Letter Grouping                       0.24                     -0.18
  Marks                                 0.04                     -0.32
  Number Patterns                       0.07                     -0.23
Factor D                    0.10                      0.19
  Arithmetic                            0.09                      0.10
  Number Series                         0.36                      0.35
  Mechanical Movement                  -0.13                      0.04

The correlation of Factor M with intelligence is 0.28. 
The Initials Test correlated 0.32 with intelligence, while the 
other component of Factor M, the Word-Number Test, cor¬ 
related very low (0.09). 

The correlations of Factors P, I, and D with intelligence
are positive, but are very low. Among the components of
Factor D, the Arithmetic Test has a low correlation with
intelligence (0.09), the Number Series Test has one of the
highest correlations in the battery with intelligence, and the
Mechanical Movements Test shows a negative correlation
with intelligence. Such correlations might raise a question
regarding the functional unity of the factors.

TABLE 2

INTERCORRELATIONS BETWEEN POINT HOUR RATIO, PSYCHOLOGICAL
TEST, AND THE SEVEN FACTORS

(N = 49)

                        O.S.U.
                        Psych. Test
              P.H.R.  (Intelligence)    P       N       V       S       M       I       D

P.H.R.          --        0.20       -0.24    0.17    0.44   -0.21    0.31   -0.13    0.19
O.S.U. Psych.
  Test         0.20        --         0.06   -0.02    0.52   -0.11    0.28    0.11    0.10
P             -0.24       0.06         --     0.34    0.27    0.54    0.14    0.47    0.10
N              0.17      -0.02        0.34     --     0.43    0.18    0.28    0.38    0.42
V              0.44       0.52        0.27    0.43     --     0.12    0.33    0.36    0.34
S             -0.21      -0.11        0.54    0.18    0.12     --     0.08    0.34    0.19
M              0.31       0.28        0.14    0.28    0.33    0.08     --     0.13    0.14
I             -0.13       0.11        0.47    0.38    0.36    0.34    0.13     --     0.17
D              0.19       0.10        0.10    0.42    0.34    0.19    0.14    0.17     --

Mean           2.40      71.6       139.9   107.1    76.9   108.7    15.1    30.7     7.2
Standard
  Deviation    0.59      20.3        25.1    36.4    23.8    36.4     7.6     7.8     2.4


3. How well can Point Hour Ratio be predicted on the
basis of the primary factor scores?

It would be desirable to be able to predict the probable
P. H. R. of a student from the scores made on the seven
factors. The chart below shows that in both situations, the
highest beta weight is that for Factor V.

TABLE 3

BETA AND b REGRESSION COEFFICIENTS

For the Scores When the OSU Psychological Test Is Included
and When It Is Omitted From the Test Battery

                       OSU Intelligence Included     OSU Intelligence Omitted
                          Beta           b              Beta           b
                       Coefficient  Coefficient      Coefficient  Coefficient

Factor P                 -.279        -.007            -.291        -.007
Factor N                  .034         .001             .090         .001
Factor V                  .568         .014             .487         .012
Factor S                 -.113        -.002            -.089        -.001
Factor M                  .216         .017             .191         .015
Factor I                 -.196        -.015            -.201        -.015
Factor D                  .046         .045             .040         .039
OSU Psychological
  Examination            -.136        -.004



The correlation with P. H. R. is increased slightly when 
the intelligence test rating is used, the correlation between 
P. H. R. and the variables being raised from 0.640 to 0.648. 
In a random sample of freshmen this difference would prob¬ 
ably be greater. 
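
The two multiple correlations can be recovered from the published figures by the standard identity R² = Σ βⱼ rⱼ, where βⱼ is the beta weight of predictor j (Table 3) and rⱼ is its correlation with P. H. R. (Table 2). The sketch below uses only those tabled values; the variable and function names are illustrative, and small discrepancies from rounding in the tables are to be expected.

```python
import math

# Correlations of each predictor with Point Hour Ratio (Table 2).
validity = {"P": -0.24, "N": 0.17, "V": 0.44, "S": -0.21,
            "M": 0.31, "I": -0.13, "D": 0.19, "OSU": 0.20}

# Beta weights (Table 3), with and without the OSU test in the battery.
beta_osu_omitted = {"P": -0.291, "N": 0.090, "V": 0.487, "S": -0.089,
                    "M": 0.191, "I": -0.201, "D": 0.040}
beta_osu_included = {"P": -0.279, "N": 0.034, "V": 0.568, "S": -0.113,
                     "M": 0.216, "I": -0.196, "D": 0.046, "OSU": -0.136}

def multiple_r(betas, validities):
    """Multiple correlation from standardized weights: R^2 = sum(beta_j * r_j)."""
    r_squared = sum(betas[name] * validities[name] for name in betas)
    return math.sqrt(r_squared)

r_omitted = multiple_r(beta_osu_omitted, validity)
r_included = multiple_r(beta_osu_included, validity)
print(round(r_omitted, 3), round(r_included, 3))
```

Both values fall within rounding error of the reported 0.640 and 0.648.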

4. Are the factor scores related to grades in specific
college subjects?

The correlations of course grades with Point Hour Ratio,
intelligence test scores, and the seven factors are found in
Table 4. The grades taken into consideration in this study
are those in English, science, foreign languages, and psychology.
The results must be taken as suggestive and not as
facts from which broad generalizations may be drawn.

TABLE 4

THE CORRELATION OF SUBJECT MATTER GRADES WITH POINT HOUR
RATIO, INTELLIGENCE, AND THE SEVEN FACTORS

                                            Foreign
                      English    Science    Language   Psychology
                       Grade      Grade      Grade       Grade

P. H. R.              [illeg.]   [illeg.]    0.77        0.58
Intelligence          [illeg.]   [illeg.]    0.54       [illeg.]
Factor P              [illeg.]   -0.12       0.27       [illeg.]
Factor N               0.34       0.03      [illeg.]     0.37
Factor V               0.75       0.68      [illeg.]     0.59
Factor S               0.44       0.23       0.56       [illeg.]
Factor M               0.42       0.18       0.45        0.23
Factor I               0.24       0.05       0.78       [illeg.]
Factor D               0.44       0.23       0.43        0.63

Number of Cases         27         30         27          25

In all four cases, there are high correlations between P. 
H. R. and grades. This is to be expected, since these grades 
are components of the Point Hour Ratio. 

English grades correlate highest with Factor V (0.75). 
Factors S, M, and D also show correlations above 0.40 with 
English grades. 




The only factor showing a correlation above 0.40 with
science grades is Factor V.

All the factors are apparently important in determining 
foreign language grades, since all factors except Factor P 
correlate above 0.40 with foreign language. The most 
significant correlation with foreign language is Factor I 
(0.78). This correlation is higher than one would expect, 
but it may be due to the fact that an inductive method is used 
at Ohio State University in teaching the beginning language 
courses. 

The highest correlation among the factors with psychol¬ 
ogy grades is with Factor D (0.63). Factor V is also high 
(0.59). These were the only two factors correlating above 
0.40 with psychology. 

Factor V correlates above 0.40 with grades in each of
the four subject fields considered, the highest being with
English grades. The correlations between Factor I and the
school subjects are low with the exception of foreign language
grade (0.78). Factor P shows very low correlations
with all four school grades. There is little differentiation
between the correlations of the school grades and Factor N,
the only correlation higher than 0.40 being with foreign
language grade. Factors S and M both have correlations over
0.40 with English and foreign language grades, and Factor D
has a significant correlation with English, foreign language,
and psychology grades.

Such observations as reported here suggest that, with 
more experience, the Thurstone Primary Abilities Test will 
become a useful instrument in the academic counseling program 
of colleges. It will be necessary to secure more data in regard 
to the relationships of test scores and course grades from a 
random sample of freshmen. Also, it will be important to 
have some knowledge of methods of instruction in the several 
courses so as to judge whether the relationship observed is a 
function of the abilities of the student and the subject matter 
being studied, or of the methods of instruction. 





A SHORT CUT IN THE ESTIMATION OF 
SPLIT-HALVES COEFFICIENTS 


CHARLES I. MOSIER 
Social Security Board 

FOR SEVERAL YEARS the writer has availed himself
of a short cut in the computation of reliability co¬ 
efficients by the split-halves technique. The method has prob¬ 
ably been developed independently by a number of other 
investigators, but it has not, to the writer's knowledge, 
appeared in print in connection with this specific problem, 
and there may be some workers to whom it may prove useful. 

With the development of the Kuder-Richardson method 
for the determination of reliability, the split-halves technique 
should probably disappear from the scene. However, as a 
number of investigators have found, it provides a fairly close 
approximation to the Kuder-Richardson value, and since it 
does not require an item-analysis it will probably continue in 
use. In any event, the purpose of this note is not the justifica¬ 
tion of the technique, but the presentation of a short cut. If 
split-halves coefficients are to be computed, they may as well 
be computed efficiently. 

In brief, the short cut involves the use of the complete 
dependence of the “even” scores on the “total” and the “odd” 
scores. We may suppose that “total” scores have already 
been obtained in connection with the original purpose of the 
test. Because of this algebraic dependency, then, it remains 
only to rescore the papers for the “odd” scores in order to 
know the “even” scores, since

$$T_i = O_i + E_i \tag{1}$$

for each individual i. Furthermore, equation (1) need not be
applied to each case separately. Not only may we dispense
with the necessity of rescoring the papers to get the “even”
score for each individual; we may go farther, and dispense
with the necessity of obtaining the individual “even” scores
at all.

By defining the value desired, namely, the correlation between
O and E, and substituting the value of E expressed as
a function of O and T from equation (1), we obtain the result
that

$$r_{OE} = \frac{r_{OT}\,\sigma_T - \sigma_O}{\sqrt{\sigma_T^2 + \sigma_O^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T}} \tag{2}$$

This expression calls for only the “odd” and “total” scores,
from which are obtained their respective standard deviations 
and the correlation between them. The value obtained from 
the odd-even correlation can then be used in the Spearman- 
Brown formula to give the estimated reliability. This value 
is identical with that which would be obtained if each test 
paper were independently scored for the “even” score, and 
the odd-even correlation coefficient computed from the re¬ 
sulting data. (This can be seen from the derivation.) 
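
The derivation is not printed in the note. A sketch consistent with equations (1) and (2) is given here, on the assumption that the sigmas and correlations are the usual standard deviations and product-moment correlations computed over the same group of cases; the covariance notation is introduced only for this illustration.

```latex
\begin{align*}
E &= T - O && \text{from equation (1)} \\
\operatorname{cov}(O,E) &= \operatorname{cov}(O,T) - \sigma_O^2
   = r_{OT}\,\sigma_O\,\sigma_T - \sigma_O^2 \\
\sigma_E^2 &= \sigma_T^2 + \sigma_O^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T \\
r_{OE} &= \frac{\operatorname{cov}(O,E)}{\sigma_O\,\sigma_E}
   = \frac{r_{OT}\,\sigma_T - \sigma_O}
          {\sqrt{\sigma_T^2 + \sigma_O^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T}}
\end{align*}
```

Because every step is an algebraic identity, the result equals the directly computed odd-even correlation exactly, not approximately.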

The expression in equation (2) is readily recognized as
a special case of the more general formula for a correlation
between a part and the whole exclusive of the part. O, E,
and T in equation (2) are by no means limited to “odd,”
“even,” and “total” scores, but apply to any set of variables
for which equation (1) is true.
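
The short cut may be sketched numerically as follows. The scores are simulated and hypothetical, the function names are illustrative, and the Spearman-Brown step uses the familiar form 2r/(1 + r) for a test of doubled length; none of this code appears in the original note.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (sd(xs) * sd(ys))

def odd_even_r(odd, total):
    """Equation (2): odd-even correlation from odd and total scores only."""
    r_ot = pearson(odd, total)
    s_o, s_t = sd(odd), sd(total)
    return (r_ot * s_t - s_o) / math.sqrt(s_t**2 + s_o**2 - 2 * r_ot * s_o * s_t)

def spearman_brown(r_half):
    """Estimated reliability of the full-length test from the half-test correlation."""
    return 2 * r_half / (1 + r_half)

# Hypothetical data: 200 examinees with simulated odd and even half scores.
random.seed(0)
ability = [random.gauss(0, 1) for _ in range(200)]
odd = [round(20 + 5 * a + random.gauss(0, 2)) for a in ability]
even = [round(20 + 5 * a + random.gauss(0, 2)) for a in ability]
total = [o + e for o, e in zip(odd, even)]

r_shortcut = odd_even_r(odd, total)   # uses odd and total scores only
r_direct = pearson(odd, even)         # requires scoring the even half as well
reliability = spearman_brown(r_shortcut)
print(r_shortcut, r_direct, reliability)
```

The two correlations agree to within floating-point error, which is the point of the short cut: the even half never has to be scored.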





MEASUREMENT ABSTRACTS* 

Adams, C. R. “A New Measure of Personality.” Journal of 
Applied Psychology, XXV (1941), 141-151. 

A new instrument for measuring personality traits, The 
Personal Audit, is described. It was intended to be relatively 
free from highly personal items since it was felt that such a 
test would be more useful in non-clinical situations. The Per¬ 
sonal Audit is believed, on the bads of low intercorrelations 
between sub-tests, to measure 9 relatively independent person¬ 
ality traits. Coefficients of reliability (corrected split-half) 
range from +.90 to +.96. Items have been validated by the 
criterion of internal consistency using a modified version of 
the Sletto technique W. A. Varvel. 

Baxter, B. and Paterson, D. G. “A New Ratio for Clinical 
Counselors.” Journal of Consulting Psychology, V 
(1941), 123-126. 

The magnitude of the standard error of measurement (S.E.M),
which clinical counselors
employ in interpreting test scores, varies in significance with
the variability (S.D.) of the norm group. It is useful, therefore,
to relate them in a ratio as an aid in interpreting
scores. The following formula, in which r is the reliability
coefficient, provides a simple way of expressing the magnitude
of S.E.M as a percentage of S.D.:

$$\frac{\mathrm{S.E.}_M}{\mathrm{S.D.}} = \frac{\mathrm{S.D.}\sqrt{1-r}}{\mathrm{S.D.}} = \sqrt{1-r}$$

Application of this ratio to a list of 49 tests shows that
S.E.M/S.D. ranges from as low as .10 to as high as .55. In general,
achievement tests show the lowest ratio (highest accuracy)
with an average of .20, followed in order by scholastic
aptitude tests (averaging .30), reading tests (.32), special




*Edited by Forrest A. Kingsbury





aptitude tests (.33), and personality tests (.27 to .55, average
.40). F. A. Kingsbury.
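
As a rough illustration of the ratio described in this abstract, the following sketch evaluates the square root of (1 - r) for a few reliability coefficients. The function name and the particular coefficients are chosen here for illustration only and do not come from the paper being abstracted.

```python
import math

def sem_to_sd_ratio(reliability):
    """Ratio of the standard error of measurement to the S.D.: sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must lie between 0 and 1")
    return math.sqrt(1.0 - reliability)

# Illustrative reliability coefficients (not values reported in the abstract).
for r in (0.99, 0.96, 0.91, 0.84, 0.70):
    print(f"r = {r:.2f}  S.E.M/S.D. = {sem_to_sd_ratio(r):.2f}")
```

On this formula a ratio of .10 corresponds to a reliability of .99, a ratio of .20 to a reliability of .96, and a ratio of .55 to a reliability of about .70, which gives some sense of the range of reliabilities implied by the figures quoted above.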


Bennett, G. K. and Raskow, S. “Extension of the Norms of
the Columbia Vocabulary Test.” Journal of Applied
Psychology, XXV (1941), 48-51.

Constructed and standardized for grades 3-8, this test 
showed a mean score of 54 and standard deviation of 15 for 
the latter half of the eighth grade. Extension of norms seems 
justified when 1212 superior recent high school graduates 
obtained a mean score of 74 with standard deviation of 12. 
When the test was administered to 5101 high school students, 
mean scores increased and the standard deviation decreased 
from grade 9A through grade 12B both among commercial 
and general-course students. Decreasing standard deviation 
probably indicates increasing homogeneity of vocabulary in 
later school years. With grade constant, mean scores decrease 
with age. J. E. P. Libby. 


Blum, Milton L. and Candee, Beatrice. “The Selection of 
Department Store Packers and Wrappers with the Aid of 
Certain Psychological Tests: Study II.” Journal of Applied
Psychology, XXV (1941), 291-299.

This study is a check on conflicting results in previous 
attempts to determine the value of finger dexterity tests in 
predicting successful wrappers or packers. Tests used were 
the O'Connor Finger Dexterity, Zeigler Placing, Otis Self- 
Administering, and Minnesota Clerical. Test performance was 
checked against production records and foreman’s ratings. 
Results indicate no relation between finger dexterity and production
for either packers or wrappers. In the experienced
group the Minnesota Clerical shows positive correlation for
both groups. It is concluded that clerical speed and accuracy
have a much higher relation to production than has finger
dexterity. D. A. Peterson.




Brown, A. W. and Blakey, R. “A Preliminary Report on the 
Development and Standardization of a Non-Verbal Test 
at the High-School Level.” Journal of Educational Psychology,
XXXII (1941), 113-123.

A series of 11 non-verbal subtests constructed on the concepts
of primary mental abilities has been standardized on a
group of 286 suburban high school students. Eight of these
subtests, two of perceptual speed, two of spatial relations, and 
four of abstract reasoning, constitute the final test, which may 
be given in forty minutes. The “Non-Verbal Reasoning Test” 
correlates with school grades .47 and with Otis I.Q. .59; Otis 
I.Q. correlates with school grades .60. Higher correlation 
with grades was not expected since the latter involve other 
abilities in addition to those in the non-verbal test. Reliability 
of the test is .97. Tentative norms are given, including derived 
scores intended to take the place of I.Q.’s at this level; standardization
on a much larger sample is being undertaken. J. E.
P. Libby.


Brown, A. W. and Cotton, C. B. “A Study of the Intelligence 
of Italian and Polish School Children from Deteriorated 
and Non-Deteriorated Areas of Chicago as Measured by 
the Chicago Non-Verbal Examination.” Child Development,
XII (1941), 21-30.

1262 Italian and Polish school children in a deteriorated 
and a non-deteriorated area were tested with a non-language 
group test battery. The children were in the fourth grade or 
above and from 10 to 14 years of age. The authors stress the 
influence of socio-economic, cultural, and educational factors 
upon test scores. They found (1) a regular decrease in mean 
test performance from age 10 to age 14 for both sexes and 
both nationality groups but not so great as that previously reported
for verbal tests; (2) sexual differences favoring the
boys, particularly in the case of Italian children; and (3) 
contradictory indications relating to socio-economic community 
level (no significant differences between areas for Italian 
boys; significant differences favoring the deteriorated area for 



Italian girls; a tendency for Polish children in the non-deteriorated
area to make better scores). W. A. Varvel.


Brush, Edward N. “Mechanical Ability as a Factor in Engineering
Aptitude.” Journal of Applied Psychology, XXV
(1941), 300-312. 

This study was intended to explore the possibilities of 
available tests of mechanical ability and aptitude as indicators 
of aptitude for engineering. The report is prefaced with a 
survey of the relevant literature. The subjects were two groups 
of students in the College of Technology at the University of 
Maine, one group of 104 members, the other group of about 
130 members. The criterion was scholastic rank in courses of 
an engineering nature. The tests used were: Minnesota Paper
Form Board, Minnesota Assembly Test, Minnesota Spatial
Relations Test, O’Connor Worksample No. 1, O’Connor
Worksample No. 5, O’Connor Worksample No. 72, Cox
Mechanical Explanation and Completion Test, Cox Mechanical
Models Test, and MacQuarrie Test for Mechanical
Ability. In addition, data on intelligence tests, algebra, chemistry,
plane geometry, and physics tests were available.

The conclusions reached are summarized as follows: “The
tests of useful predictive power were the Cox Tests of Mechanical
Aptitude and Minnesota Paper Form Board .... Batteries
of mechanical ability tests yield correlations with the
criterion of about .40; batteries in which an intelligence test is
combined with one or two tests of mechanical ability yield
correlations of about .50 .... several batteries of mechanical
ability tests predict engineering scholarship at least as well as
the intelligence tests, while the achievement tests, singly and in
combination, predict success in engineering studies somewhat
better than do the tests of mechanical ability .... total
engineering record is more highly correlated with first
semester and first year grades than with any test or combination
of tests.” J. E. Karlin.




Burtt, H. E. and others. “Market Problems and Market Research.”
Journal of Consulting Psychology, V (1941),
No. 4, 145-193. 

This entire number is devoted to eight papers on market 
research, not separately abstracted here because of space 
limitations. The authors and titles are as follows: “Current
Trends in Marketing Research” (H. E. Burtt); “Proving
Ground on Public Opinion” (H. G. Weaver); “Problems of
Sampling in Market Research” (Frank Stanton); “Characteristics
of the Question as Determinants of Dependability”
(J. G. Jenkins); “Evaluating the Effectiveness of Advertising
by Direct Interviews” (P. F. Lazarsfeld); “Effects of Repeated
Interviewing on the Respondent’s Answers” (F. D.
Ruch); “The Museum Technique Applied to Market Research”
(G. K. Bennett); and “The Role of Psychological
Interpretation in Market Research” (A. W. Kornhauser).
F. A. Kingsbury. 


Casanova, T. “Analysis of the Effect upon the Reliability
Coefficient of Changes in Variables Involved in the Estimation
of Test Reliability.” Journal of Experimental Education,
IX (1941), 219-228.

The following topics are discussed and various formulae
developed in detail: (1) the variance of the halves in the
split-half method of estimating reliability; (2) the correction 
for guessing with specific reference to the reliability of rights 
and wrongs, the variance of rights and wrongs, the correlation 
of rights with wrongs, the variance of the number of items 
attempted, and the number of possible choices; (3) the effect 
of calling all negative scores zero; (4) the variance of the 
items. In the latter case, a formula for estimating the reliability
of a test in terms of the item variances is presented
which is felt to be more convenient than the Kuder-Richardson
formulae. W. A. Varvel.


Cattell, Raymond B., Feingold, S. Norman, and Sarason, Seymour
B. “A Culture-Free Intelligence Test: II. Evaluation




of Cultural Influence on Test Performance.” Journal of
Educational Psychology, XXXII (1941), 81-100. 

A culture-free intelligence test described in an earlier paper 
was administered together with the Binet (Terman-Merrill), 
A.C.E. (arithmetical sections), and the Arthur Performance 
Tests, to four comparable groups; these were given special 
training, one group in each class of information or skill demanded
by the tests. Retest analyses showed the Arthur least
influenced by training in its own culture medium, the Culture- 
Free Test next, Binet next, and A.C.E. most influenced. The 
above tests and the Ferguson formboards were administered 
to a group of adult immigrants, resident in this country about 
one year, and a control native group. The Ferguson was very 
close to the Arthur, others followed in the order noted earlier, 
when the groups were retested after 77 days during which the 
immigrants gained noticeably in Americanization. Reliability 
of the Culture-Free Test compares favorably with those of the 
others. Adequate validity is indicated by the Culture-Free 
Test’s high loading in the general factor brought out by 
tetrads, and by its high mean correlation with the pool of 
tests. Since life experience probably brings factors in the Culture-Free
Test to saturation in widely different cultures, its
proper application appears broader than that of preceding
tests. J. E. P. Libby.


Driver, Randolph S. “The Validity and Reliability of Ratings.”
Personnel, XVII (1941), 185-191.

Rating is of value in industry only when its limitations as
a scientific instrument are fully appreciated. The various current
methods of obtaining measures of validity and reliability
are discussed and their values and limitations considered. In
order for a rating to be acceptable, it must be proven valid
and reliable. Although this is difficult to accomplish, ratings are not
useless, but great caution must be observed in their interpretation.
Virginia Brown.




Dudycha, George J. “A Suggestion for Interviewing for Dependability
Based on Student Behavior.” The Journal of
Applied Psychology, XXV (1941), 227-231. 

College students were divided into groups of extreme 
earliness and lateness, of dependability and undependability, 
on the basis of observation of their behavior in life situations. 
Ten questions on punctuality and persistence and six on dependability,
when presented to these contrasting groups,
elicited responses indicating significant group differences. Since 
these questions appear to be diagnostic in student behavior, it 
is suggested that they be tested for usefulness in employment 
situations for discovering those applicants likely to prove 
dependable. Virginia Brown. 

Dulsky, S. G. “Vocational Counseling. I. By Use of Tests;
II. By Interview.” Personnel Journal, XX (1941), 16-28.

The author briefly and critically examines various types of 
standardized tests available to the vocational counselor. He 
concludes that aptitude tests are of no value and personality 
tests of very limited value. Interest inventories, if used properly,
may be helpful. Tests of intelligence and educational
achievement are approved as being of the most value. He 
advocates greater emphasis on the vocational interview as a 
means of diagnosing personality and motivation and of identifying
and evaluating interests. Self-guidance from the study
of test scores and profiles is impossible. Vocational counseling 
is an individual process, requiring “skilled psychologists” 
rather than “mental testers.” The vocational counselor should 
confine himself to descriptive rather than quantitative reports 
of test and interview results and should only rarely go beyond 
general recommendations to his clients. W. A. Varvel. 

Ebert, Elizabeth H. “A Comparison of the Original and 
Revised Stanford-Binet Scales.” The Journal of Psychology,
XI (1941), 47-61.

1434 records of 315 children five to ten years of age were 
studied for information as to the comparability of I.Q.’s from
the original and revised Stanford-Binet Scales. An increasing 



discrepancy between I.Q. values on the two scales was found 
at ages 7, 8, and 9. The new revision tends to give lower 
I.Q.’s for levels below 100 and higher I.Q.’s for levels above 
100. The duller children gain slightly more in I.Q. than the 
brighter ones although both groups show increases. With the 
old revision, the duller individuals gain but the brighter ones 
lose. Although the average I.Q. of the 1916 revision was more 
constant, individuals maintained their relative positions better 
in the new revision. Virginia Brown.


Eysenck, H. J. “Type-Factors in Aesthetic Judgments.”
British Journal of Psychology, XXXI (1941), 262-270. 

It has been found previously that the analysis of the intercorrelations
between the rankings of pictures by a number of
subjects yields mainly one general factor with no other significant
factor. On this occasion the attempt is made to bring
out the influence of any such secondary factor, even, if need
be, at the expense of the “T” or general factor. Five series
of pictures, each consisting of thirty to fifty items, were judged
in order of goodness by fifteen subjects. The subjects were
artists, university students, bank clerks, typists, and teachers,
eight women and seven men, with age range from 20 to 70.
The table of correlations for each of the five series was factored
and two significant factors extracted in all cases except
one. One factor was the “T” factor previously identified; the
other factor, called the “K” factor, seemed to divide the
population into two different “types,” one preferring the modern,
and the other the older style of painting. This factor,
identified provisionally with “brightness,” correlated with
extroversion, radicalism, youth, and possibly with preference
for color. The color-form test also appeared to be correlated
with extroversion. Results are definite enough to suggest that
further research into the relation between temperament and
aesthetic preferences will not only extend knowledge of the
“type” factors in aesthetic judgments, but also increase understanding
of temperamental “types.” J. E. Karlin.





Ferguson, L. W. “A Study of the Likert Technique of Attitude
Scale Construction.” Journal of Social Psychology,
XIII (1941), 51-57. 

The suggestion is here examined that Likert’s method of 
constructing and scoring attitude scales gives results as valid 
as those of the method outlined by Thurstone and Chave with 
much less labor. Items constructed by the former method 
(Minnesota Scale for the Survey of Opinions) were rescaled 
by the latter; standard deviations of the distribution of scale 
values indicate that such items are adequately scaled by this 
method. The scale values obtained indicate that Likert’s
technique does not obviate the need of a judging group. With
one exception, the scales cannot be scored by the Thurstone
method. Scores obtained by the two methods for the exceptional
scale show a correlation of .70, confirming the conclusion.
J. E. P. Libby.

Greene, E. B. Measurements of Human Behavior. New York: 
The Odyssey Press, pp.777. 1941. 

This volume of 24 chapters is divided into three parts:
Part I, “Basic Considerations” (discussing introductory concepts,
varieties of appraisals, score-interpretation, measures of
relationship, types of instruments, item construction and
evaluation, factor analysis); Part II, “Instruments and Results”
(tests of early childhood, of achievement, Binet-type
and group intelligence scales, performance, mechanical and 
motor tests, measures of fine arts—design, literature, and 
music—tests of interests, attitudes, adjustment); Part III, 
“Persistent Problems” (effects of practice on scores, measures 
of growth and senescence, absolute scaling, evaluation of judgments,
native differences). A 30-page bibliography, a combined
glossary and subject-index, 121 tables, and 108 figures
are features of the book. F. A. Kingsbury. 


Guilford, J. P. “A Note on Dubois’s Method of Deriving an
Achievement Ratio for Students.” Journal of Educational
Psychology, XXXII (1941), 220-222.

Dubois’s achievement ratio is that of the student’s actual




average mark to that mark corresponding to the standard score 
obtained on a psychological test; these ratios in general are 
low for students with high test scores and high for those with 
low scores. This finding follows from the assumption of a 
correlation of 1.00 between test scores and marks, while 
Dubois gives the correlation as .442. A student may be expected
to deviate from the mean mark by only .442 as much
as his standard score indicates. It is suggested that Dubois’s
conclusion might be reversed if the regression line r equals
.442 be taken as base. Computation of a special but meaningful
case confirms this suggestion. J. E. P. Libby.
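
The point about the regression line can be illustrated with a small computation. The sketch below is not Dubois’s or Guilford’s actual procedure, which the abstract does not give in detail; it assumes a linear mark scale with a hypothetical mean of 2.0 and standard deviation of 0.5, and takes two hypothetical students whose actual marks fall exactly on the regression line for r = .442.

```python
def expected_mark(z_test, r, mean_mark, sd_mark):
    """Mark predicted from a test standard score when the correlation is r."""
    return mean_mark + r * z_test * sd_mark

def achievement_ratio(actual_mark, z_test, r, mean_mark, sd_mark):
    """Actual mark divided by the mark expected from the test score."""
    return actual_mark / expected_mark(z_test, r, mean_mark, sd_mark)

MEAN_MARK, SD_MARK = 2.0, 0.5   # hypothetical mark scale (assumption)
R_OBSERVED = 0.442              # correlation quoted in the abstract

# Two hypothetical students achieving exactly what the regression predicts.
high_z, low_z = 2.0, -2.0
high_actual = expected_mark(high_z, R_OBSERVED, MEAN_MARK, SD_MARK)
low_actual = expected_mark(low_z, R_OBSERVED, MEAN_MARK, SD_MARK)

# Baseline that treats the correlation as 1.00.
ratio_high_r1 = achievement_ratio(high_actual, high_z, 1.0, MEAN_MARK, SD_MARK)
ratio_low_r1 = achievement_ratio(low_actual, low_z, 1.0, MEAN_MARK, SD_MARK)

# Baseline that uses the regression line for r = .442.
ratio_high_reg = achievement_ratio(high_actual, high_z, R_OBSERVED, MEAN_MARK, SD_MARK)
ratio_low_reg = achievement_ratio(low_actual, low_z, R_OBSERVED, MEAN_MARK, SD_MARK)

print(round(ratio_high_r1, 3), round(ratio_low_r1, 3))
print(round(ratio_high_reg, 3), round(ratio_low_reg, 3))
```

Under these assumptions the r = 1.00 baseline gives a ratio of about .81 to the high-scoring student and about 1.56 to the low-scoring one, although both are achieving exactly as the observed correlation predicts; with the regression baseline both ratios are 1.00.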


Hay, Edward N. “Tests in Industry.” Personnel Journal,
XX (1941), 3-15. 

This is a discussion of the opportunities for psychologists 
in industry. The use of intelligence tests is coming more and 
more to act as a check on employer’s judgment, which is customarily
biased in favor of the qualities of aggressiveness and
good personality. Such tests indicate the level at which the 
employee is able to work most efficiently and his potentialities 
for further promotion. It is particularly important to obtain 
psychological information about an employee at the time of 
his entry into a firm since the work he does then determines 
to a large extent his opportunities for advancement. A beginning
job may require an I.Q. of about 100 but higher
positions require higher I.Q.’s so that an employee progressing
reasonably well in the initial job may become unfit when
advanced to the more complex positions. It becomes advisable
to judge prospective employees not on the basis of the intelligence
required for their first positions but for the positions
to which they should be able to rise. With the use of objective 
tests, information becomes generally available for an entire 
firm so that transfers and promotions from one department to 
another can be advised with a minimum of further consultation,
since the qualities required in other work are known and
the abilities of the employee are likewise known at the time of 
first testing. Apart from the question of job maladjustment 



there is a further fruitful field for the industrial psychologist
in the problem of a better supervisor-employee relationship.
An illustrative study of these methods at work accompanies
the discussion. J. E. Karlin.

Johnson, Donald M. and Reynolds, Floyd. “A Factor Analysis 
of Verbal Ability.” Psychological Record, IV (1941), 
183-195. 

The literature on problem solving among animal and 
human subjects suggests that there may be two fundamental 
processes involved: “F,” the flow of various acts or responses; 
and “S,” the selection of these responses according to the 
requirements of the problem. This study tested the hypothesis
that individual differences in these two processes are a major
determinant for scores on problem-solving tests. This investigation
was limited to verbal problems. There were ten tests
involving the supplying of verbal responses; the tests varied
in restriction of choice of responses from complete freedom to
supply any word to restriction to the supplying of only certain
words according to a rigid criterion. The subjects were 113
summer-school students at Fort Hays Kansas State College.
A centroid analysis of the table of corrected correlation coefficients
yielded two factors. The tests fell within a positive
manifold, after rotation, indicating two definite factors reasonably
identified as the “F” and “S” postulated in the hypothesis.
It appears that these two factors are probably closely related
to, if not identical with, Thurstone’s “W” and “V” factors. It
is concluded that the two processes or functions mentioned
account to a large extent for the variance in verbal problem-solving
tests. These findings are further discussed with reference
to tests of vocabulary, intelligence, and reading. J. E.
Karlin.

Kornhauser, A. W. and Schultz, R. S. (et al.). “Research on
Selection of Salesmen” (and other papers). Journal of
Applied Psychology, XXV (1941), No. 1, 1-47.

Five papers read at the Section of Industrial and Business
Psychology of the American Association for Applied Psychology
in 1940, together with an introductory article, are presented
in this number, but are not separately abstracted
because of space limitations. In addition to the introductory
article (title as above), the authors and titles of the papers
are as follows: “Selection of Casualty and Life Insurance
Agents” (M. A. Bills); “Recent Research in the Selection of
Life Insurance Salesmen” (A. K. Kurtz); “A Report of Research
on the Selection of Salesmen at the Tremco Manufacturing
Company” (O. A. Ohmann); “Procedures for the
Selection of Salesmen for a Detergent Company” (J. L. Otis);
and “Selection Research in a Sales Organization” (T. M.
Stokes). F. A. Kingsbury.

Lowell, Frances E. “A Study of the Variability of I.Q.’s in 
Retests.” Journal of Applied Psychology, XXV (1941), 
341-356.

The main purpose of this study was to seek corroboration
of the results obtained in Cleveland Public Schools in recent
years which seemed to show that the I.Q.’s of school children
tended in certain instances to vary between test and retest.
The data were composed of 1000 cases that had two tests
only, 1000 cases that had three tests, and 1000 cases that had
four tests. The Terman 1916 revision of the Binet was used
in all tests. It was found that there are significant decrements
in I.Q. both for groups and for chronological age. Furthermore,
the I.Q. range, the chronological age at first test, and
the interval elapsing between first and last tests may all be 
eliminated as causes for variation in I.Q. on retest. Nor does 
sex influence variations in I.Q. between first and last tests. 
On the average, four times as many cases on retest decrease 
in I.Q. as increase. In particular, those cases that increase 7 
or more points on the first retest decrease 5 times as often as 
they increase on the second retest. The data on the first retest 
seem to indicate that the older the child is, the less chance 
there is that his second I.Q. will increase. J. E. Karlin. 

McCloy, C. H. “The Factor Analysis as a Research Technique.”
Research Quarterly, XII (1941), 22-33.



This paper presents an elementary discussion of some 
fundamental concepts and limitations of factor analysis. Particular
reference is made to its possible uses in the field of
health and physical education. Specific examples of precautions 
to be taken and the kind of studies to which this type of 
correlational analysis may be applied are given in terms of 
research in physical education. The method has been utilized 
in (1) studies of motor skills, (2) analysis of anthropometric 
data, (3) analysis of cardiovascular variables, and (4) studies 
of character and personality traits. A 17-item bibliography is 
included. W. A. Varvel.


Mosier, C. I. “A Psychometric Study of Meaning.” Journal
of Social Psychology, XIII (1941), 123-140.

256 adjectives expressing judgmental relationships which
could be placed along a favorable-neutral-unfavorable continuum
were rated on an 11-point scale by college students in
psychology. Some 140 ratings were obtained for each word.
“Two basic hypotheses . . . are confirmed: first, that the
meaning of a word may be considered as if it consisted of two
parts, one constant and representative of the usual meaning of
the word, and one variable, representative of individual interpretation
in usage and associated context and general usage;
second, that the frequency with which any particular meaning
is evoked is describable by the Gaussian Law.” The presence
of words with two discrete meanings, yielding bimodal frequency
distributions of responses, was noted. The effect of
adverbial modifiers on the meaning of an adjective was studied.
“A scale with a rational basis has been developed and values
describing quantitatively the modal meaning and the ambiguity
of more than 200 adjectives have been obtained.” W. A.
Varvel.


Oral Trade Tests: Group Leaders’ Handbook. The Personnel
and Training Section in collaboration with the Local
Office Operations Section and Chicago Occupational Research
Center. Division of Placement and Unemployment



Compensation, 205 West Wacker Drive, Chicago, Illinois.
February, 1941. 18 pp.

This handbook is designed to instruct the group-leaders
in interviewing in the construction and use of Oral Trade
Tests. It is based on Oral Trade Questions, Vol. I, prepared
by Occupational Analysis Section, United States Employment
Service Division (not available for general use). There are
three divisions: History of Oral Trade Questions; Preparation
of Oral Trade Questions; and Application of Oral Trade
Questions in Operating Offices. Specific examples of the application
of oral trade questions are given. D. A. Peterson.


Osgood, C. E. and Stagner, Ross. “Analysis of a Prestige 
Frame of Reference by a Gradient Technique.” Journal
of Applied Psychology, XXV (1941), 275-290. 

This study was designed to demonstrate a method for 
analyzing a frame of reference, and to investigate the particular
determinants of the frame of reference known as occupational
prestige. Subjects were required to judge a number
of occupational stereotypes with respect to a psychological
“gradient” continuum varying from the description “brains”
on the one extreme to “brawn” on the other. In a second part 
of the test the judgments were made about persons rather than 
occupations. The subjects were 100 Dartmouth College men, 
students in introductory psychology, 50 of whom filled out the 
“job” form and 50 the “person” form. There were 15 names 
of occupations and each was accompanied by a set of ten 
characteristics. It was found that general rankings for prestige 
correlate on the average highly with median judgments on the 
gradient test, but that the reactions on the job forms were 
significantly different from the person forms. Prestige is
imputed to occupations per se on the basis of such characteristics
as hopefulness, being noticed, financial return, brains;
prestige is imputed to men in specified jobs on the basis of 
brains, leadership, and self-assuredness. Since the conditions 
of the experiment are deemed to exclude the possibility of 
conscious verbalization of a prestige frame of reference, it is 




concluded that the mere presentation of a set of occupational 
stereotypes for a series of judgments caused the spontaneous
establishment of a prestige framework which then determined 
in a reliable manner judgments on the specific traits listed. The 
technique is practical and adaptable. J. E. Karlin. 

Powell, N. J. “Check List for Use in Civil Service Objective 
Test Preparation.” Public Personnel Quarterly, II (1941), 
13-16. 

The author has prepared a diagnostic check list designed 
to increase the probability of considering all the major bases 
for appraisal of the test being constructed. Guiding questions 
are listed under each of the construction problems for examining
the individual item and the test as a whole. A dual criterion
is suggested and instructions for use of check list are given. 
It is emphasized that while the degree of correlation between 
test score and job performance is important, it is not the only 
indicator of adequacy of examination. D. A. Peterson. 

Powell, N. J. “Steps in Written Test Construction.” Public 
Personnel Quarterly, II (1941), 73-76. 

The process of constructing a written test is analyzed, 
assuming that examinations are made public (i.e., a test item 
cannot be used more than once). The following general problems
are treated in outline form: 1. the determination of the
abilities to be measured; 2. the determination of the test content
which measures the desired abilities; 3. the allocation of
emphasis; 4. the preparation of the test items; 5. the arrangement
and editing of the test items; 6. the experimental tryout;
7. final test copy; and 8. general considerations with regard to
test preparation integrity. D. A. Peterson. 

Reyburn, H. A. and Taylor, J. G. “Some Factors in Intelligence.”
British Journal of Psychology, XXXI (1941),
249-261. 

This study is intended to throw further light on the controversy
regarding the unitary functioning of a general factor,
g, in tests of intelligence. The material consisted of ten tests 



purporting to measure some aspect of intelligence, the tests
being formboards, repetition of digits, repetition of digits
backwards, matching tests, absurdities, Porteus mazes, arithmetical
reasoning, reasoning tests, vocabulary tests, and dissected
sentences. The tests were given to 1497 South African
children with ages ranging from 12 to 18. Five factors were
extracted from centroid analysis of the inter-test correlations.
The axes were then rotated orthogonally so as to preserve a
positive manifold and, if possible, retain a general factor
present in all the tests. It turns out, however, that no general
factor is present. Three factors are immediate memory span
(in digits forwards and digits backwards), verbal (in dissected
sentences and vocabulary) and perceptual dexterity (in
dissected sentences, matching, mazes); the two other factors
are present in equal proportions in matching, arithmetic, and
reasoning. Neither of these two is g as ordinarily operationally
defined; one factor is the ability to find or make a
significant pattern in a mass of irrelevant material, and the
other factor is the ability of logical elimination. The suggestion
is made that g in this battery is complex and that orthodox
tests of g need to be constructed to preserve its functional
unity. J. E. Karlin.

Roff, Merrill. “A Statistical Study of the Development of
Intelligence Test Performance.” Journal of Psychology,
XI (1941), 371-386.

Using data available in the literature, correlations between
test performance of children at a specific age and the gain in
their performance one or more years later were estimated.
The fact that the correlations showed no tendency to increase
as the interval between test and retest increases indicates that
the “Constancy of the I.Q.” is due primarily to retention of
earlier skills and knowledge rather than to correlations
between earlier scores and later increments. On the assumption
that the I.Q. variability is constant, the same procedures were
used to find correlations which would result if scores and later
increments were uncorrelated. No comparison of these values
and empirical findings is made. Lorraine Bouthilet.





MEASUREMENT ABSTRACTS 


Schellhammer, Fred M. “The Intelligence Test in Teacher- 
Training Institutions.” School and Society, LIII (1941), 
319. 

In a survey of 150 teacher-training institutions, 103 were 
found to use the intelligence test in student selection and 
evaluation of intelligence, 18 relying on it solely and 85 using 
it in conjunction with other techniques such as high school 
records and interview and faculty reports, no one combination 
finding universal favor. The majority of institutions
considered the high school record as important as the intelligence
test, and were supplementing both measures with subjective 
techniques. Virginia Brown. 


Super, D. E. “A Comparison of the Diagnoses of a
Graphologist with the Results of Psychological Tests.” Journal
of Consulting Psychology, V (1941), 127-133. 

To check the claims of a woman “graphologist,” 24
students submitted samples of their handwriting and obtained
the graphologist’s diagnoses. These were compared with the
most appropriate of several test scores (Intelligence, Fryer
& Sparling’s Occupational Intelligence Norms, Strong
Vocational Interest, and Bernreuter Personality Inventory). Use
of chi-square and other methods showed no more than chance
relationship between occupations recommended and those
indicated as suitable for intelligence scores obtained;
occupations rated as unsuitable by interest tests were recommended
with more than chance frequency; personality traits were
estimated by the graphologist with no more than chance
agreement with test scores (on four traits), and worse than
chance agreement (on two traits). F. A. Kingsbury.


Thomson, Godfrey. “Critical Notice of ‘The Factors of the
Mind’ by Cyril Burt.” British Journal of Educational 
Psychology, XI (1941), 45-51. 

Thomson writes a brief review of Burt's most recent book
(The Factors of the Mind, Univ. of London Press, 1940, xiv
+ 509). The major portion of the review considers Burt's
section on the distribution of temperamental types and the
application of factor analysis to persons as well as to tests.
Thomson does not agree that the philosophical approach to
factor analysis is easier or more illuminating than the
geometrical, but he does express agreement with Burt's conclusions as
to the metaphysical status of mental factors. W. A. Varvel.

Traxler, Arthur E. and others. “Psychological Tests and 
Their Uses.” Review of Educational Research, XI 
(1941), 1-130. 

This issue, consisting of eight papers not separately 
abstracted because of space limitations, is concerned with the 
construction, evaluation, and application of psychological 
tests. Individual articles are accompanied by extensive
bibliographies. The following is a list of authors and titles:

I “Brief Overview of the Period” (Arthur E. Traxler)

II “Current Construction and Evaluation of Intelligence
Tests” (Dewey B. Stuit)

III “Applications of Intelligence Tests” (J. B. Stroud)

IV “Measurement of Aptitudes in Specific Fields” (David
Segel)

V “Current Construction and Evaluation of Personality
and Character Tests” (Arthur E. Traxler)

VI “Projective Methods in the Study of Personality”
(Percival M. Symonds)

VII “Applications of Personality and Character
Measurement” (John W. M. Rothney)

VIII “Statistical Methods Related to Test Construction and
Evaluation” (John C. Flanagan)

D. A. Peterson.

Traxler, Arthur E. “Stability of Scores on the Primary
Mental Abilities Tests.” School and Society, LIII (1941),
255.

Test-retest correlations after one year ranging, with one
exception, from .578 to .917 were found for the scores of 104
pupils in grades X-XII on Thurstone’s Primary Mental
Abilities Tests. The guidance value of the perceptual, memory,
and inductive tests may be limited, for their correlations fell
below .80. These results should be checked with a larger and
more representative group as the sampling and number of
cases were not adequate in the present study. Virginia Brown.

The Use of Tests in the Illinois State Employment Service.
The Personnel and Training Section in collaboration with
the Local Office Operations Section, Division of Placement
and Unemployment Compensation, 205 West Wacker
Drive, Chicago, Illinois. February, 1941. 11 pp.

This pamphlet is intended to assist interviewers in the use
of test results as a supplementary tool in “making more
objective the evaluations which must be made during the
interview.” The use of tests is related to the interviewer’s other
tools (i.e., Job Descriptions, The Dictionary of Occupational
Titles, Registration and Placements Aids). The article
describes the types of tests, proficiency and aptitude tests, used
in aiding the interviewer to evaluate work skills. Aptitude
test batteries have been developed for three fields: selling,
clerical work, and manual work. Three graphic illustrations
of the relation of scores on aptitude tests to job performance are
given. D. A. Peterson.


Viteles, M. S. “A Psychologist Looks at Job Evaluation.” 
Personnel Journal, XVII (1941), 165-176. 

The author recognizes the importance of job evaluation as
a basic feature of the industrial relations program. In
promoting the adjustment of workers, there is a need for a
procedure designed to establish an equitable basis of
compensation, to facilitate transfer and promotion, and to eliminate
duplication of activities. The chief consideration of the paper
is a critical examination of the various types of job evaluation
programs in the light of psychological principles and
experience. Ways are indicated in which improvements might be
effected through the application of the techniques and
principles of applied psychology. The present trial and error
approach could be converted into a “rational, logical, and
scientific system of analysis.” W. A. Varvel.



Welch, Alfred C. “An Analytic System of Testing
Competitive Advertising.” Journal of Applied Psychology, XXV
(1941), 176-189.

This study is intended to correct the usual copy-test
procedure, which is defective in that it yields only a gross
evaluation, by combining into a unified system a number of different
tests which will provide the advertiser with clues to help him
improve his advertising. An analytic system of testing
competitive advertising was developed to provide a method of
suggesting specific strong and weak aspects of an advertising
campaign as well as to provide a gross evaluation of the effects
of the campaign. The system is based upon four tests: a
Brand Preference scale (described previously); a Brand
Familiarity scale (controlled association or aided recall) in
which the respondents were required to name five brands in
response to each of two stimulus-words, cigarettes and fountain
pens; a Theme Familiarity test (Link's method of triple
associates) in which the respondent must identify the sponsor of a
particular advertising theme; and a Theme Credence test (a
belief test that does not require the respondent to report
directly whether he believes an advertising claim). Tests of
reliability and validity for the various scales indicate that the
Brand Familiarity, Theme Familiarity, and Theme Credence
tests were useful supplements to the Brand Preference scale in
analyzing the effects of advertising but that none of the three
tests could be depended upon as a valid measure if used alone.
Examples of the use of the analytic system are given. J. E.
Karlin.

Wells, F. L. “Some Functions of Mental Measurements in
the Young Superior Adult.” Journal of Consulting
Psychology, V (1941), 105-110.

A review of cases seen through the psychiatric division
of the student health department in a large endowed
university reveals about ten classes of adjustment problems. These
are distinguished by different patterns of performance on the
various standard examination techniques. Representative
cases are described. F. A. Kingsbury.





